Abstract

Error-correcting codes and related combinatorial constructs play an important role in several 
recent (and old) results in computational complexity theory. In this paper we survey results on 
locally-testable and locally-decodable error-correcting codes, and their applications to complex- 
ity theory and to cryptography. 

Locally decodable codes are error-correcting codes with sub-linear time error-correcting al- 
gorithms. They are related to private information retrieval (a type of cryptographic proto- 
col), and they are used in average-case complexity and to construct "hard-core predicates" for 
one-way permutations. Locally testable codes are error-correcting codes with sub-linear time 
error-detection algorithms, and they are the combinatorial core of probabilistically checkable 
proofs. 
1 Introduction

Many recent (and not-so-recent) results in complexity theory rely on error-correcting codes. The use 
of coding-theoretic concepts, constructions and algorithms has been a major theme of complexity 
theoretic results in the past few years, and so has been the re-interpretation of older results in a 
coding-theoretic language. 

An error-correcting code is a mapping $C : \{0,1\}^k \to \{0,1\}^n$ (or, more generally, $C : \Sigma^k \to \Gamma^n$,
where $\Sigma$ and $\Gamma$ are finite sets) with the property that if we are given a string y that is "close" to a
valid encoding C(x),^1 then it is possible to find out the message x from the "corrupted encoding"
y. Towards this goal, it is necessary and sufficient that for any two different messages x and x' their
encodings C(x) and C(x') differ in a lot of coordinates. Error-correcting codes are motivated by
the task of reliably sending information over a noisy channel. In such an application, the sender has
a message x, computes C(x) and sends it over the channel; because of noise, the receiver receives a
string y that differs from C(x) in a few coordinates (the ones where transmission errors occurred),
but if the number of errors is bounded then the receiver can still reconstruct x from y.

1.1 Early Uses of Error-Correcting Codes in Cryptography and Complexity 

A natural application of error-correcting codes in computational complexity is to the setting of fault- 
tolerant computation. In one natural model of fault-tolerant computation, we want to compute a 
boolean function using a circuit, and each gate of the circuit has a small probability e of failing 
(and producing a wrong output). We would like to construct a "fault-tolerant" circuit that, even 
in the presence of these errors, will have a reasonably high probability of computing the function 
correctly. (In this model one typically assumes that the failures of different gates are mutually
independent events.) This problem was introduced by von Neumann [vN56], who suggested that
error-correcting codes could be applied to it. Low-density parity-check codes were applied to
compute linear functions [Eli58, Tay68] in variants of this model and general functions [Pap85] in
the general model.

Another early application of error-correcting codes to cryptography was Shamir's secret sharing
scheme [Sha79], which can be seen as an application of Reed-Solomon codes.^2 A different use of
coding theory for secret sharing is in [BOGW88] and in subsequent work on the
"information-theoretic" model of security for multi-party computations.

Finally, we mention that McEliece's cryptosystem [McE78] is based on the conjectured
intractability of certain coding-theoretic problems. The study of the complexity of coding-theoretic
problems is clearly an important source of interaction between coding theory and complexity
theory, but in this paper we will restrict ourselves to the use of algorithmic coding-theoretic results in
complexity theory.

^1 Meaning that y and C(x) differ in a small number of coordinates.
^2 This connection was first noticed by McEliece and Sarwate [MS81].
1.2 Error-Correcting Codes and Average-Case Complexity



A paper by Levin [Lev87] contains one of the earliest uses of coding theory in order to prove an
average-case complexity result. The goal of the paper is to construct pseudorandom generators
from certain one-way functions, and a preliminary step is to construct "hard-core predicates" for
such functions.^3 Without getting too technical, we can abstract the use of error-correcting codes
in [Lev87] as follows: (i) there is a computational problem P that is presumably hard to solve on a
certain set of inputs; (ii) we think of the right answers for P on those inputs as our "message" and
we encode it with an error-correcting code; (iii) we define a new computational problem P', which
is to compute entries of the above encoding. The important idea is now to observe that if we have
a good-on-average algorithm for P', that is, an algorithm that solves P' on all but a small fraction
of inputs, we can think of the set of outputs of this algorithm as being a "corrupted" version of
our encoding of P; using a decoding algorithm for our code we can now solve P correctly on all
inputs, which contradicts our intractability assumption for P. In conclusion, from a problem P
that was assumed to be worst-case hard and from an error-correcting code we have constructed a
new problem P' that is average-case hard.

The above outline skips an important point: presumably the complete description of P (and 
P') is an object of exponential size, while we would like our worst-case to average-case reduction 
to run in polynomial time, so that we want to use an error-correcting code for which decoding can 
be performed in poly-logarithmic time. 

Roughly speaking, in Levin's paper and in other similar cryptographic applications one applies 
an encoding "locally," to small pieces of the computational problem, so that it is not necessary to 
have a poly-logarithmic time decoder. As a consequence, the reduction relates a stronger form of 
average-case complexity to a weaker form (which is enough for these cryptographic applications) 
instead of relating average-case complexity to worst-case complexity (which is important in other 
complexity-theoretic applications) . 

Sections |31 and 0] are devoted to error-correcting codes having decoding algorithms running in 
poly-logarithmic (or even constant) time, and their applications to complexity theory and cryptog- 
raphy. Some of the applications follow the same line of reasoning sketched above. 



1.3 Program Testing, Hard-Core Bits, and Sub-linear Time Error-Correction 



Work done in the late 1980s and early 1990s on "hiding instances from oracles" [BF90], on the
self-reducibility of the permanent [Lip90] and of PSPACE-complete and EXP-complete problems
[FF93], as well as work more explicitly focused on average-case complexity [BFNW93], is now
seen as based on sub-linear time decoding algorithms for certain polynomial-based error-correcting
codes, although this is a view that has become common only since the late 1990s.

Such results were typically discussed in terms of self-correction, a notion introduced by Blum,
Kannan, Lipton and Rubinfeld [BK89, Lip90, BLR93] in the setting of program testing.



Around the same time, Goldreich and Levin [GL89] introduced an efficient and general way
of constructing hard-core predicates for one-way functions (the cryptographic problem mentioned
above and discussed extensively later in this paper). The Goldreich-Levin construction is now seen as a
sub-linear time list-decoding algorithm for an error-correcting code, a perspective first suggested by
Impagliazzo and Sudan in unpublished papers in the early 1990s. The coding-theoretic perspective
is useful because it suggests that different, and possibly even more efficient, hard-core predicates can
be constructed using different codes and different decoding algorithms.^4 Improvements to [GL89]
via the solution of other decoding problems are reported in [GRS00], with an explicit discussion of
sub-linear time decoding. Recent work by Akavia, Goldwasser and Safra [AGS03] gives a
coding-theoretic interpretation (along with generalizations and improvements) for other hard-core predicate
constructions that previously seemed to require ad-hoc algebraic analyses and to be independent
of coding theory.

^3 We extensively discuss hard-core predicates for one-way permutations later in this paper.

A paper by Babai et al. [BFLS91] is probably the first one to explicitly discuss sub-linear time
decoding algorithms for error-correcting codes, and their possible relevance in the classical setting
of coding theory, that is, error-resistant storage and transmission of information.^5 The relevance of
sub-linear time decoding to average-case complexity, and the generality of the approach of using a
code to encode the description of a computational problem, are pointed out explicitly in [STV01].
Katz and this author [KT00] give the first negative results for codes with sub-linear time decoding
algorithms and note that, besides their relation to hard-core predicates and average-case complexity,
they are also related to private information retrieval [CGKS98], a type of cryptographic protocol
discussed later in this paper.



1.4 Program Testing and Locally Testable Codes 

Apart from Levin's work [Lev87], which motivated [GL89], most of the line of work described in the
previous section can be traced to the work on program testing by Blum and Kannan [BK89] and
Lipton [Lip90]. Suppose that we are interested in computing a function f, and that we are given
an algorithm A that may or may not be correct: is it possible to test the correctness of A "on the
fly" while we run it? The approach proposed in [BK89, Lip90] was roughly as follows: to construct
a self-testing procedure for f that, given an algorithm A, would accept with high probability if A
solves f correctly on all inputs, and would reject with high probability if A is incorrect on many
inputs. (Note that the self-testing procedure may accept with high probability an algorithm that
makes few mistakes.) An algorithm rejected by the self-tester would be discarded as buggy. If an
algorithm A is accepted by the self-tester, then we would use A in connection with a self-corrector
for f. A self-corrector for f is a procedure that, given an algorithm A that solves f on many inputs,
and given an arbitrary input x, computes f(x) with high probability.
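
The self-corrector idea can be illustrated on the simplest case, a linear function over GF(2). The following is a minimal sketch under stated assumptions, not the procedure of any of the cited papers: the names `self_correct`, `parity_of`, `bad_inputs` and the toy parameters are all hypothetical, and inputs are represented as n-bit integers. It relies only on the identity f(x) = f(x XOR r) XOR f(r), which holds for every linear f, together with a majority vote.

```python
import random

def self_correct(A, x, n, trials=101, rng=None):
    """Guess f(x) for a linear f over GF(2)^n from a mostly-correct program A.

    Uses the identity f(x) = f(x XOR r) XOR f(r), valid for every linear f,
    and takes a majority vote over independent random choices of r.
    """
    rng = rng or random.Random()
    votes = 0
    for _ in range(trials):
        r = rng.randrange(1 << n)      # uniformly random point, as a bitmask
        votes += A(x ^ r) ^ A(r)       # each query point is uniform on its own
    return 1 if 2 * votes > trials else 0

def parity_of(mask):
    """Return the linear function x -> <mask, x> mod 2 on bitmask inputs."""
    return lambda x: bin(mask & x).count("1") % 2

# Hypothetical buggy program: agrees with f except on four of the 64 inputs.
n = 6
f = parity_of(0b101101)
bad_inputs = {3, 17, 40, 63}
A = lambda x: f(x) ^ 1 if x in bad_inputs else f(x)

rng = random.Random(0)
corrected = [self_correct(A, x, n, rng=rng) for x in range(1 << n)]
```

Each vote is wrong only if exactly one of the two queried points is a bad input, which happens with probability at most 8/64 here, so the majority is correct except with negligible probability, even on the inputs where A itself is wrong.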

Sudan's PhD thesis [Sud92] is an early work that makes an explicit connection between
self-testing and error-detection,^6 and between self-correcting and error-correction.

We note that self-correction, besides being related to error-correction, also relates to
average-case complexity (a worst-case intractable problem that is self-correctable is also necessarily
average-case intractable). Lipton [Lip90] presents a self-corrector that works for any function that can be
expressed as a low-degree polynomial, and, in particular, is a self-corrector for the permanent.
Encoding PSPACE-complete and EXP-complete problems using a polynomial-based encoding (which
is called the Reed-Muller code, as we will see in a later section), Feigenbaum and Fortnow [FF93]
give self-correctors for certain PSPACE-complete and EXP-complete problems, and Babai et al.
[BFNW93] use these results to prove average-case complexity results for certain EXP-complete
problems. Since the self-correction perspective is very natural, it took some time to see the
constructions of [FF93, BFNW93] as being about error-correcting codes with sub-linear time decoding.

^4 Such improvements were the focus of the manuscripts by Impagliazzo and Sudan.
^5 As we discuss below, known and conjectured negative results make such applications unlikely.
^6 The error-detection problem for an error-correcting code is to distinguish a valid encoding C(x) from a string y
that is not a valid encoding of any message.
Just as self-correction is strongly related to sub-linear time decoding of error-correcting codes,
self-testing is related to sub-linear time error-detection. The self-testing algorithms by Blum,
Luby and Rubinfeld [BLR93] for linear functions and by Gemmell et al. [GLR+91, RS96] for
polynomial functions can indeed be seen as sub-linear time error-detection algorithms for certain
error-correcting codes. Such testing algorithms played a fundamental role in the construction of
probabilistically checkable proofs (PCP) [FGL+91, AS98, ALM+98], which in turn revolutionized
the study of approximation algorithms.^7

Locally testable codes, that is, codes with sub-linear time error-detection algorithms, were soon
recognized to be the combinatorial core of PCP constructions, and the question of providing simpler
and more efficient constructions of such codes was posed as an open question in various writings
from the mid 1990s, such as [Aro94, Spi95, FS95, RS96]. Great progress has been made towards
such constructions in the past two years, with the latest results [BSGH+04, DR04] providing a
clearer perspective on the relationship between these codes and PCP constructions.



1.5 Further Reading 

Regarding coding theory in general, van Lint's book [vL99] is an excellent reference. Madhu Sudan's
notes [Sud, Sud01] are excellent introductions to algorithmic coding theory, and they are the main
source that we used for our brief presentation of results in algorithmic coding theory later in this paper.
A survey on applications of coding theory to complexity theory was written by Joan Feigenbaum
[Fei95] about ten years ago. Many themes treated in [Fei95] are still current. Venkat Guruswami's
thesis [Gur01] has a chapter on applications of coding theory to complexity and cryptography.
A survey paper by Madhu Sudan [Sud00] focuses on applications of list-decoding algorithms to
complexity theory, including the applications to average-case complexity and hard-core predicates
that we discuss in this paper.



1.6 Organization of this Paper 

We start the paper with some review material on error-correcting codes and algorithmic coding 
theory. This material has wider applications than the ones that we chose to focus on in this paper. 

We then consider sub-linear time error-correction algorithms, their relation to private informa- 
tion retrieval, and their applications in average-case complexity and cryptography. 

Finally, we discuss sub-linear time error-detection algorithms and their relation to PCP
constructions.



2 Error-Correcting Codes 
2.1 Shannon's Setting 

A party called the sender has a message $x \in \Sigma^k$ that he wants to send to another party called the
receiver. Here $\Sigma$ is a finite alphabet (often $\Sigma = \{0,1\}$) and k is the message length.

The sender and the receiver communicate through a noisy channel that introduces errors. To
eliminate errors (or, at least, to dramatically reduce the probability of errors) the sender first
encodes his message using an encoding function $C : \Sigma^k \to \Sigma^n$ with n > k that introduces some
redundancy, and then sends C(x) through the channel. The receiver receives a string y that is
possibly different from C(x) because of transmission errors. The receiver then feeds y to a decoding
algorithm D that, under some assumption about the error pattern introduced by the channel, is
able to compute x.

^7 This is a story that is both too long and too exciting to be effectively summarized here. We try to give such a
summary in Section 5.4. The reader should also refer to one of the many excellent survey papers on the subject,
such as, for example, [Aro98].

We would like to design efficient procedures C and D such that the above holds under general
assumptions about the channel and with n not much larger than k. This setting was introduced
by Shannon [Sha48] in his monumental work that defined information theory.

2.2 Error-Correcting Codes 

The Hamming distance $d_H(a, b)$ between two strings $a, b \in \Sigma^n$ is the number of entries i such that
$a_i \neq b_i$.

An $[n, k, d]_q$ code is a function $C : \Sigma^k \to \Sigma^n$ such that

• $|\Sigma| = q$;

• For every $x \neq x' \in \Sigma^k$, $d_H(C(x), C(x')) \geq d$.

The parameter k is called the information length of the code and n is called the block length.
Abusing terminology a little, we call d the minimum distance of the code.^8

If an $[n,k,d]_q$ code admits a decoding procedure that is always able to correct e errors, then it
must be that $d \geq 2e + 1$. Conversely, if $d \geq 2e + 1$ then there is a (possibly not efficiently computable)
decoding procedure that is able to correct up to e errors.
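
To make the relation between d and e concrete, here is a minimal sketch of the "possibly not efficiently computable" decoder mentioned above: it simply returns the message whose encoding is nearest to the received string. The function names and the toy repetition code are assumptions chosen for illustration.

```python
def hamming_distance(a, b):
    """Number of positions where the equal-length sequences a and b differ."""
    assert len(a) == len(b)
    return sum(1 for ai, bi in zip(a, b) if ai != bi)

def minimum_distance(code):
    """Minimum Hamming distance between encodings of distinct messages."""
    words = list(code.values())
    return min(hamming_distance(u, v)
               for i, u in enumerate(words) for v in words[i + 1:])

def nearest_codeword_decode(code, y):
    """Brute-force decoder: the message whose encoding is closest to y."""
    return min(code, key=lambda x: hamming_distance(code[x], y))

# Toy example: the 5-fold repetition code, a [5, 1, 5]_2 code,
# so d = 5 >= 2e + 1 with e = 2.
code = {(0,): (0, 0, 0, 0, 0), (1,): (1, 1, 1, 1, 1)}
d = minimum_distance(code)
e = (d - 1) // 2
received = (1, 0, 1, 1, 0)    # encoding of (1,) with two flipped bits
decoded = nearest_codeword_decode(code, received)
```

The search takes time proportional to the number of messages, which is why it is only an existence argument and not an efficient algorithm.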

Error-correcting codes, introduced by Hamming [Ham50], solve the coding problem in models
where there is an upper bound to the number of errors introduced by the channel. Error-correcting
codes can also be used in settings where we have a probabilistic model for the channel, provided that 
we can show that with high probability the number of errors introduced by the channel is smaller 
than the number of errors that the decoding procedure can correct. In the rest of this paper we 
only discuss error-correcting codes, but the reader can see that any algorithmic result about error- 
correcting codes implies an algorithmic solution to Shannon's problem for various distributions of 
errors. 

For a given k, we are interested in constructing $[n,k,d]_q$ codes where n is small (ideally,
$n = O(k)$), d is large (ideally, we would like $d = \Omega(n)$) and q is small (ideally, $\Sigma = \{0,1\}$ and q = 2).
Sometimes we will call the ratio k/n the information rate (or just rate) of the code, which is the
"amortized" number of alphabet elements of the message carried by each alphabet element sent
over the channel. We will also call the ratio d/n the relative minimum distance of the code.

2.3 Negative Results 

Before seeing constructions of error-correcting codes, let us start by seeing what kind of trade-offs 
are impossible between k, d and n. 

Suppose C is an $[n,k,d]_q$ code, and associate to each message x the set of strings $S_x$ defined as
the set of all strings y that agree with C(x) on the first n - d + 1 coordinates. We claim that these
sets are all disjoint. Otherwise, if we had $y \in S_x \cap S_{x'}$ for two different messages x and x', we would
have that y and C(x) agree in the first n - d + 1 coordinates, and so would y and C(x'), but then
C(x) and C(x') would also have to agree on the first n - d + 1 coordinates; they could then differ in at
most d - 1 coordinates, and this would contradict the minimum distance requirement
of the code. Now, we have $q^k$ disjoint sets, each of size $q^{d-1}$, contained in a space of size $q^n$, so that
$q^k \cdot q^{d-1} \leq q^n$, and so we have proved the following result.

Lemma 1 (Singleton Bound) In an $[n, k, d]_q$ code, $k \leq n - d + 1$.

^8 More precisely, the minimum distance of a code C is $\min_{x \neq x'} d_H(C(x), C(x'))$, so that if C is an $[n,k,d]_q$ code,
then d is a lower bound to the minimum distance.

As we will see later, this negative result can be matched if the size q of the alphabet is large
enough compared to n. For smaller alphabets, however, stronger bounds are known.

Lemma 2 (Plotkin's Bound) In an $[n, k, d]_q$ code, $k \leq n - \frac{q}{q-1} \cdot d + \log_q n$.

For example, if q = 2, then the relative minimum distance d/n cannot be larger than 1/2, and
for constant q it cannot be larger than 1 - 1/q. For proofs of these results, see for example [vL99].

2.4 Constructions of Error-Correcting Codes and Decoding Algorithms 

In this section we will consider various constructions of error-correcting codes. All these
constructions will have the property of being linear, that is, the alphabet $\Sigma$ will be a field $\mathbb{F}$, and the encoding
function $C : \mathbb{F}^k \to \mathbb{F}^n$ will be a linear function.

If C is a linear code, then there is a matrix A such that the encoding function can be specified
as $C(x) = A \cdot x$. Also, there is a matrix H such that y is a codeword (that is, a possible output of
C) if and only if $H \cdot y = \mathbf{0}$, where $\mathbf{0}$ is the all-zero vector. This means that for every linear code C
there is always an encoding circuit of size at most quadratic (that simply computes $A \cdot x$ given x)
and a circuit of size at most quadratic that solves the error-detection problem, that is, the problem
of deciding whether a given string is a codeword or not.

Let the weight of a vector $y \in \mathbb{F}^n$ be defined as the number of non-zero entries. (Equivalently,
the weight of a vector is its Hamming distance from the all-zero vector.) Then it is easy to see that
the minimum distance of a linear code is equal to the minimal weight of a non-zero codeword.^9
This observation often simplifies the study of the minimum distance of linear codes.
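
As a small illustration of the matrices A and H and of the minimum-weight observation, the sketch below uses a binary [7, 4, 3] Hamming code as an example. The choice of code and the helper names are assumptions made for illustration; the Hamming code is not one of the constructions discussed in this section.

```python
from itertools import product

# Generator matrix A (7 x 4): C(x) = A . x over GF(2).
A = [
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
    [1, 1, 0, 1],
    [1, 0, 1, 1],
    [0, 1, 1, 1],
]
# Parity-check matrix H (3 x 7): y is a codeword iff H . y = 0.
H = [
    [1, 1, 0, 1, 1, 0, 0],
    [1, 0, 1, 1, 0, 1, 0],
    [0, 1, 1, 1, 0, 0, 1],
]

def mat_vec(M, v):
    """Matrix-vector product over GF(2)."""
    return tuple(sum(m * x for m, x in zip(row, v)) % 2 for row in M)

def is_codeword(y):
    """Error detection: check whether the syndrome H . y is all-zero."""
    return not any(mat_vec(H, y))

def weight(y):
    """Number of non-zero entries of y."""
    return sum(1 for yi in y if yi != 0)

codewords = [mat_vec(A, x) for x in product((0, 1), repeat=4)]
min_weight = min(weight(c) for c in codewords if any(c))
min_dist = min(weight(tuple((a + b) % 2 for a, b in zip(u, v)))
               for u in codewords for v in codewords if u != v)

# The encoding of (1,0,0,0) is (1,0,0,0,1,1,0); flip its fifth coordinate.
corrupted = (1, 0, 0, 0, 0, 1, 0)
```

Both encoding and error detection are single matrix-vector products, which is the quadratic-size circuit bound mentioned above, and the brute-force minimum distance coincides with the minimum weight of a non-zero codeword.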

2.4.1 Random Error-Correcting Codes

As a first example of a linear error-correcting code, we see what happens if we pick at random a linear
code over the field {0, 1}. In order to show that, with high probability, the code has large minimum
distance, we show that, with high probability, all non-zero inputs are mapped into codewords with
a large number of ones. This is easy to show because, for a random matrix A and a fixed non-zero
vector x, the encoding $A \cdot x$ is uniformly distributed, and so it has a very low probability of having
low weight. The argument is completed by using a union bound. The formal statement of the result
and a sketch of the proof are below. This existence result is called the Gilbert-Varshamov bound
because it was first proved by Gilbert [Gil52] for general random codes, and then Varshamov [Var57]
observed that the same bound could be obtained by restricting oneself to random linear codes.

Lemma 3 (Varshamov) For every $\delta < 1/2$ and every n there is an $[n, Rn, \delta n]_2$ linear code such
that

$$R \geq 1 - H_2(\delta) - \Theta((\log n)/n),$$

where $H_2(x) = x \log_2(1/x) + (1 - x) \log_2(1/(1 - x))$ is the binary entropy function.

^9 To be precise, one also needs to assume that the encoding function C is injective. Alternatively, one can see that
the minimum distance is equal to the minimum weight of the encoding of a non-zero input.

Proof: We pick a linear function $C : \{0,1\}^k \to \{0,1\}^n$ at random by picking at random an n x k
0/1 matrix A and defining C(x) = Ax.

We use the probabilistic method to show that there is a positive probability that every non-zero
message is encoded into a string with at least d ones.

For a particular $x \neq \mathbf{0}$, we note that C(x) is uniformly distributed in $\{0,1\}^n$, and so

$$\Pr[w(C(x)) < d] = 2^{-n} \cdot \sum_{i=0}^{d-1} \binom{n}{i} \leq 2^{-n} \cdot 2^{n H_2(d/n) + O(\log n)},$$

where we used the fact that there are $2^{n H_2(w/n) + \Theta(\log n)}$ binary strings of length n having weight w.
A union bound shows that

$$\Pr[\exists x \neq \mathbf{0} : w(C(x)) < d] \leq 2^k \cdot 2^{-n} \cdot 2^{n H_2(d/n) + O(\log n)},$$

which is smaller than 1 under the assumption of the lemma. □

It is worth noting that the observation about minimum distance versus minimum weight plays
a role in the proof. A proof that didn't use such a property would have considered the event that
two possible inputs are mapped into encodings at distance smaller than d, and then we would
have taken a union bound over all such pairs. This would have led to a bound worse by a factor of
two than the bound we achieved above.

As an interesting special case, we have that for every $\epsilon > 0$ there is a constant rate $R = \Omega(\epsilon^2)$
such that for every n there is an $[n, Rn, (1/2 - \epsilon) \cdot n]_2$ linear code. That is, there are linear codes
with constant rate and with relative minimum distance arbitrarily close to 1/2. (Recall that
Plotkin's bound does not permit codes with relative minimum distance strictly larger than 1/2.)
More generally, if q is a prime power and $\epsilon > 0$ then there is a constant rate R for which
$[n, Rn, (1 - 1/q - \epsilon) \cdot n]_q$ linear codes exist for all sufficiently large n.

There is no known algorithm to decode random linear codes in polynomial time on average. It 
is, however, possible to solve the decoding problem for any linear code in exponential time by using 
brute force. 
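
The following sketch, offered only as an illustration, evaluates the leading term of the rate in Lemma 3 and samples a small random binary linear code whose minimum weight is found by the exponential-time brute force mentioned above. The helper names, the seed and the parameters n = 24, k = 6 are arbitrary assumptions.

```python
import math
import random
from itertools import product

def H2(x):
    """Binary entropy function, with H2(0) = H2(1) = 0."""
    if x in (0, 1):
        return 0.0
    return x * math.log2(1 / x) + (1 - x) * math.log2(1 / (1 - x))

def gv_rate(delta):
    """Leading term 1 - H2(delta) of the rate (lower-order term dropped)."""
    return 1 - H2(delta)

def random_generator_matrix(n, k, rng):
    """Sample an n x k matrix with independent uniform 0/1 entries."""
    return [[rng.randrange(2) for _ in range(k)] for _ in range(n)]

def min_nonzero_weight(A, k):
    """Minimum weight of A . x over all non-zero messages x (2^k work)."""
    best = len(A)
    for x in product((0, 1), repeat=k):
        if any(x):
            w = sum(sum(a * b for a, b in zip(row, x)) % 2 for row in A)
            best = min(best, w)
    return best

rng = random.Random(1)
n, k = 24, 6
A = random_generator_matrix(n, k, rng)
d = min_nonzero_weight(A, k)   # minimum distance of this sample, if d > 0
```

For instance, `gv_rate(0.11)` is very close to 1/2, so the lemma promises binary linear codes of rate about 1/2 and relative minimum distance 0.11 for large n.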

2.4.2 Reed-Solomon Codes 

The next code we consider is based on the following well-known fact about (univariate) polynomials:
a polynomial of degree t is either identically zero or it has at most t roots.

Encoding and Minimum Distance. In a Reed-Solomon code [RS60] we think of every message
as representing a low-degree polynomial, and the encoding of the message is the n values that we
get by evaluating the polynomial at n fixed points. A more formal description follows.

Let q be a prime power and $\mathbb{F}_q$ be a finite field of size q. Let us fix n distinct elements of $\mathbb{F}_q$,
$x_1, \ldots, x_n$, and let $k \leq n$. We define an $[n, k, n - k + 1]_q$ linear code as follows.

Given a message $(c_0, \ldots, c_{k-1})$, we interpret it as a description of the polynomial
$p(x) = c_0 + c_1 x + \cdots + c_{k-1} x^{k-1}$. The encoding of such a message will be the vector $(p(x_1), \ldots, p(x_n))$.

Such a procedure indeed maps a message of length k into an encoding of length n, and it is
a linear mapping. To verify the claim about the minimum distance, if $(c_0, \ldots, c_{k-1})$ is not the
all-zero vector, then the corresponding polynomial p is a non-zero polynomial of degree at most k - 1. Such
a polynomial can have at most k - 1 roots, and so at least n - (k - 1) of the values $p(x_1), \ldots, p(x_n)$
must be non-zero. Note that Reed-Solomon codes meet the Singleton bound,
and thus have an optimal trade-off between rate and minimum distance.
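
A minimal sketch of Reed-Solomon encoding, assuming for simplicity that q is a prime p so that field arithmetic is arithmetic modulo p. The function name `rs_encode` and the toy parameters q = 7, n = 6, k = 3 are illustrative assumptions. The minimum weight is computed by enumerating all messages.

```python
from itertools import product

def rs_encode(message, points, p):
    """Evaluate c_0 + c_1 x + ... + c_{k-1} x^{k-1} at each point, modulo p."""
    def evaluate(x):
        acc = 0
        for c in reversed(message):   # Horner's rule
            acc = (acc * x + c) % p
        return acc
    return tuple(evaluate(x) for x in points)

# Toy parameters: q = 7, n = 6, k = 3.
p, k = 7, 3
points = [0, 1, 2, 3, 4, 5]           # n distinct elements of F_7
n = len(points)

codewords = {m: rs_encode(m, points, p)
             for m in product(range(p), repeat=k)}
min_weight = min(sum(1 for v in c if v != 0)
                 for m, c in codewords.items() if any(m))
```

With these parameters the minimum weight equals n - k + 1 = 4, matching the Singleton bound with equality.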



Decoding Algorithms. Decoding the Reed-Solomon code in a channel that introduces
e < (n - k + 1)/2 errors is equivalent to the following problem:

• Given: distinct elements $x_1, \ldots, x_n$ of $\mathbb{F}_q$, parameters e and k, with e < (n - k + 1)/2, and
elements $y_1, \ldots, y_n$ of $\mathbb{F}_q$;

• Find: a polynomial p of degree at most k - 1 such that

$$\#\{i : p(x_i) \neq y_i\} \leq e.$$



Note that, because of the constraint on e and k, the problem always has a unique solution p. A
polynomial time algorithm for the decoding problem has been known since the early 1960s, following
Peterson's polynomial time algorithm to decode BCH codes [Pet60] and the reduction of Gorenstein
and Zierler [GZ61], who showed that decoding Reed-Solomon codes can be seen as a special case
of the problem of decoding BCH codes.^10 A simple and efficient polynomial time algorithm for
the decoding problem for Reed-Solomon codes was devised by Berlekamp and Welch [WB86]. We
describe the Berlekamp-Welch algorithm in the Appendix.
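
To make the decoding problem itself concrete, the sketch below solves it by exhaustive search over all q^k candidate polynomials. This is deliberately not the Berlekamp-Welch algorithm and takes exponential time; it only illustrates the Given/Find specification above. It assumes q is a prime p, and the names and toy parameters are hypothetical.

```python
from itertools import product

def evaluate(coeffs, x, p):
    """Evaluate the polynomial with the given coefficients at x, modulo p."""
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * x + c) % p
    return acc

def rs_unique_decode(points, received, k, e, p):
    """Return the coefficients of a polynomial of degree at most k - 1 that
    disagrees with `received` in at most e positions, or None if none exists."""
    assert 2 * e < len(points) - k + 1        # unique-decoding regime
    for coeffs in product(range(p), repeat=k):
        errors = sum(1 for x, y in zip(points, received)
                     if evaluate(coeffs, x, p) != y)
        if errors <= e:
            return coeffs
    return None

# Toy instance: q = 7, n = 6, k = 2, so up to e = 2 errors can be corrected.
p, k, e = 7, 2, 2
points = [0, 1, 2, 3, 4, 5]
message = (3, 5)                               # p(x) = 3 + 5x
codeword = [evaluate(message, x, p) for x in points]
received = list(codeword)
received[1] = (received[1] + 6) % p            # corrupt two coordinates
received[4] = (received[4] + 3) % p
decoded = rs_unique_decode(points, received, k, e, p)
```

Because 2e is smaller than the minimum distance n - k + 1 = 5, at most one polynomial can pass the check, so the first one found is the answer.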



2.4.3 Reed-Muller Codes 

Reed-Muller codes [Ree54] generalize Reed-Solomon codes by considering multivariate polynomials
instead of univariate polynomials. That is, we think of each message as specifying a low-degree
multivariate polynomial, and the encoding of the message is the evaluation of the polynomial at a
certain set of points. If the evaluation points are suitably chosen, we still have the property that
a non-zero low-degree polynomial has few roots among these points, and so we can still infer that
the resulting encoding is an error-correcting code with good minimum distance.

Let q be a prime power and $\mathbb{F}$ be a field of size q. To define a Reed-Muller code we choose
a subset $S \subseteq \mathbb{F}$, a degree t < |S| and a parameter m. We will think of an input message as the
description of an m-variate degree-t polynomial. The message is encoded by specifying the value
of the polynomial at all the points in $S^m$.

We can see that there are up to $\binom{m+t}{m}$ possible monomials in an m-variate polynomial of degree
at most t. An input message is, therefore, a sequence of $k = \binom{m+t}{m}$ coefficients. The encoding is
the evaluation of the polynomial at $n = |S|^m$ different points. Note that if m = 1 we are back to
the case of Reed-Solomon codes. Regarding minimum distance, we have the following result, which
is called the Schwartz-Zippel Lemma (after [Sch80] and [Zip79]) in the computer science literature.



Lemma 4 If p is a non-zero degree-t polynomial in m variables over a field $\mathbb{F}$ and $S \subseteq \mathbb{F}$, then

$$\Pr_{x \sim S^m}[p(x) = 0] \leq \frac{t}{|S|}.$$
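
The bound can be checked exhaustively on a small instance. The sketch below assumes the prime field of size 11, a six-element set S, and two hypothetical bivariate polynomials of total degree 3 chosen only for illustration; the second one shows that the bound t/|S| can be attained.

```python
from itertools import product

p = 11                       # work in the prime field F_11
S = [0, 1, 2, 3, 4, 5]       # evaluation set S, a subset of F_11
m, t = 2, 3                  # number of variables and total degree

def poly(x, y):
    """A non-zero polynomial of total degree 3 (arbitrary example)."""
    return (x * x * y + 2 * x * y + 5 * y * y + 7) % p

def tight_poly(x, y):
    """x(x - 1)(x - 2): vanishes exactly when x is 0, 1 or 2."""
    return (x * (x - 1) * (x - 2)) % p

zeros = sum(1 for x, y in product(S, repeat=m) if poly(x, y) == 0)
tight_zeros = sum(1 for x, y in product(S, repeat=m) if tight_poly(x, y) == 0)

fraction = zeros / len(S) ** m
bound = t / len(S)           # the Schwartz-Zippel bound t / |S|
```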



^10 BCH codes are a class of algebraic error-correcting codes that we will not discuss further in this paper.
To compare Reed-Solomon codes and Reed-Muller codes it can be helpful to look at a concrete
example. Suppose we want to map k field elements into n field elements and we want the minimum 
distance to be at least n/2. 

In the Reed-Solomon code we would choose n to be 2k, and so the minimum distance will be 
k + 1 > n/2. The field size has to be at least 2k. 

In the Reed-Muller code, we also have to choose the parameter m. Suppose we choose m = 2.
Then we want to choose t and S such that $t = |S|/2$ and $k = \binom{t+2}{2}$, so that $k \approx t^2/2$ and $|S| = 2t$. The
encoding length is $|S|^2 = 4t^2 \approx 8k$. The field size has to be at least $2t \approx 2\sqrt{2k}$.

We see that the rate has become worse, that is, the encoding length is bigger, but the field size 
can be smaller, that is, a smaller alphabet is sufficient. 
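
The comparison can be reproduced numerically. The sketch below is illustrative only: the function names are hypothetical, and the Reed-Muller distance is the lower bound n/2 obtained from the Schwartz-Zippel Lemma with t = |S|/2.

```python
from math import comb

def reed_solomon_params(k):
    """Reed-Solomon with n = 2k: d = n - k + 1 > n/2, field size at least n."""
    n = 2 * k
    return {"k": k, "n": n, "d": n - k + 1, "min_field_size": n}

def reed_muller_params(t, m=2):
    """Reed-Muller with |S| = 2t: a non-zero degree-t polynomial vanishes on
    at most half of S^m, so the minimum distance is at least n/2."""
    s = 2 * t
    n = s ** m
    return {"k": comb(m + t, m), "n": n, "d": n // 2, "min_field_size": s}

rm = reed_muller_params(t=100)          # m = 2 as in the example above
rs = reed_solomon_params(rm["k"])       # same message length k
blowup_rm = rm["n"] / rm["k"]           # approaches 8 as t grows
blowup_rs = rs["n"] / rs["k"]           # exactly 2
```

For t = 100 the message length is k = 5151; the Reed-Muller encoding has length 40000 over a field of size at least 200, while the Reed-Solomon encoding has length 10302 but needs a field of size at least 10302.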

For larger values of m, we would get an encoding length n = 2^^'^^k and a requirement that 
the field be of size at least 20("*) ■ fcVOM 

What is the extreme case of very small alphabet and very large encoding length? We can
choose t = 1, so that we only need |S| = 2, but then we have m = k - 1, and, catastrophically,
$n = 2^{k-1}$. In this code, we see the input message $(c_0, c_1, \ldots, c_{k-1})$ as representing the affine function
$(x_1, \ldots, x_{k-1}) \mapsto c_0 + c_1 x_1 + \cdots + c_{k-1} x_{k-1}$. The encoding is the evaluation of such a function at
all points in $\{0,1\}^{k-1}$.

We may in fact consider an even more wasteful code in which we interpret the message as a
linear, instead of affine, function. That is, we think of a message $(c_1, \ldots, c_k)$ as representing the
function $(x_1, \ldots, x_k) \mapsto c_1 x_1 + \cdots + c_k x_k$, and the encoding is the evaluation of such a function
at all points in $\{0,1\}^k$. Such a $[2^k, k, 2^{k-1}]_2$ encoding is typically called (with some abuse of
terminology) the Hadamard code.
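
A minimal sketch of this encoding for k = 4, with a hypothetical function name; it checks by enumeration that every non-zero codeword has weight exactly 2^{k-1}.

```python
from itertools import product

def hadamard_encode(c):
    """Evaluate x -> c_1 x_1 + ... + c_k x_k (mod 2) at every x in {0,1}^k."""
    k = len(c)
    return tuple(sum(ci * xi for ci, xi in zip(c, x)) % 2
                 for x in product((0, 1), repeat=k))

k = 4
codewords = [hadamard_encode(c) for c in product((0, 1), repeat=k)]
block_length = len(codewords[0])                      # 2^k
nonzero_weights = {sum(w) for w in codewords if any(w)}
```

A non-zero linear function over GF(2) takes the value 1 on exactly half of its inputs, which is why a single weight appears.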

2.4.4 Concatenated Codes 

Reed-Solomon and Reed-Muller codes have very good trade-offs between rate and minimum distance 
and, indeed, the Reed-Solomon codes exhibit an optimal trade-off. The drawback of Reed-Solomon 
and Reed-Muller codes is the need for large alphabets: in the Reed-Solomon code the alphabet 
size must be at least as large as the encoding length; in the Reed-Muller codes smaller alphabets 
are possible, but the trade-off between rate and minimum distance worsens when one uses smaller 
alphabets. 

Concatenation is a method that can be used to reduce the alphabet size without compromising 
too much the information rate and the minimum distance. 

Suppose that we have an $[N, K, D]_Q$ code $C_o : \Gamma^K \to \Gamma^N$ and an $[n, k, d]_q$ code $C_i : \Sigma^k \to \Sigma^n$.

Suppose also that $Q = q^k$, and let us fix some way to identify elements of $\Gamma$ with strings in $\Sigma^k$. We
call $C_o$ the outer code and $C_i$ the inner code.^11

Let $X \in \Gamma^K$ be a message and let $C_o(X)$ be its encoding. We can think of each coordinate
of $C_o(X)$ as containing a message from $\Sigma^k$, and we can apply the encoding $C_i(\cdot)$ to each such
message. The end result will be a string in $\Sigma^{nN}$. If we start from two different messages X, X',
their encodings $C_o(X)$ and $C_o(X')$ will differ in at least D coordinates, and each such coordinate
will lead at least d coordinates of its second-level encoding to be different. In summary, we have
described a way to encode a message from $\Gamma^K$ as a string in $\Sigma^{nN}$ so that any two different encodings
differ in at least dD coordinates. If we observe that we can identify $\Gamma^K$ with $\Sigma^{kK}$, we conclude that
what we just described is an $[nN, kK, dD]_q$ code.

^11 The reason for this terminology will be clear in a minute.
Lemma 5 (Concatenation) Suppose we have an explicit construction of an $[N, K, D]_Q$ code and
of an $[n, k, d]_q$ code, with $Q = q^k$; then we also have an explicit construction of an $[nN, kK, dD]_q$ code.

This idea is due to Forney [For66].

By concatenating a Reed-Solomon code of rate 1/2 and relative minimum distance 1/2 with
another Reed-Solomon code with the same rate and relative minimum distance, we can get, say, an
$[n, n/4, n/4]_{O(\log n)}$ code.

If we concatenate such a code with a linear code promised by the Gilbert-Varshamov bound, we
get an $[n, \Omega(n), \Omega(n)]_2$ code, and the needed binary code is so small that it can be found efficiently
by brute force.

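What about decoding? Suppose we concatenate an $[N, K, D]_Q$ code and an $[n, k, d]_q$
code, and the outer code (respectively, the inner code) has a decoding algorithm that can correct
$E$ errors (respectively, $e$ errors). Then it is easy to design a decoding algorithm for the concatenated
code that corrects up to $eE$ errors: decode each block with the inner decoder, and then apply the outer
decoder to the resulting string. A block is decoded incorrectly only if it contains more than $e$ errors,
and with at most $eE$ errors overall this happens in fewer than $E$ blocks.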

Unfortunately this is far from optimal: since $e < d/2$ and $E < D/2$, we are able to correct
fewer than $dD/4$ errors, while we might hope to correct up to $dD/2 - 1$ errors.

There is a more sophisticated general decoding algorithm for concatenated codes, due to Forney
[For66], which is beyond the scope of this short overview. Forney's algorithm can indeed decode
up to $dD/2 - 1$ errors.

2.4.5 Error Rate for Which Unique Decoding is Possible 

Using concatenation, the codes described so far, algorithms for these codes, and Forney's decoding
algorithm for concatenated codes, it is possible to show that for every $\epsilon > 0$ there is a rate $R$
and, for every large enough $n$, an $[n, Rn, (1/2 - \epsilon) \cdot n]_2$ code that can be encoded and decoded in
polynomial time; the decoding algorithm is able to correct up to $(1/4 - \epsilon/2) \cdot n$ errors.

This is the largest fraction of errors for which the decoding problem can be solved. Recall that 
the decoding problem is well defined only if the number of errors is less than half the minimum 
distance, and that for a binary code with good rate the minimum distance cannot be more than 
n/2, so that it is not possible to correct more than n/4 errors in a binary code. 

If we use an alphabet of size $q$, the results described in this section imply that for
every $\epsilon > 0$ there is a rate $R$ and, for every large enough $n$, an $[n, Rn, (1 - 1/q - \epsilon) \cdot n]_q$ code that
can be encoded and decoded in polynomial time; the decoding algorithm is able to correct up to
$(1/2 - 1/2q - \epsilon/2) \cdot n$ errors. This is, again, essentially the best possible fraction of errors for which
unique decoding is possible.

2.5 List Decoding 

The notion of list decoding, first studied by Elias [Eli57], allows us to break the barrier of $n/4$ errors
for binary codes and $n/2$ errors for general codes.

If $C : \Sigma^k \to \Sigma^n$ is a code, a list-decoding algorithm for radius $r$ is an algorithm that, given a
string $y \in \Sigma^n$, finds all the possible messages $x \in \Sigma^k$ such that the Hamming distance between $C(x)$
and $y$ is at most $r$. If $r$ is less than half the minimum distance of the code, then the algorithm
will return either an empty list or the unique decoding of $y$. For very large values of $r$, the list of
possible decodings could be exponentially big in $k$. The interesting combinatorial question, here,
is to find codes such that, even for very large values of $r$, the list is guaranteed to be small.

Algorithmically, we are interested in producing such lists in polynomial time. 

As we will see, there are binary codes for which efficient list-decoding is possible (with lists
of constant size) even if the number of errors is of the form $(1/2 - \epsilon) \cdot n$. For codes over larger
alphabets, even $(1 - \epsilon) \cdot n$ errors can be tolerated.

2.5.1 The Hadamard Code 

The simplest case to analyse is the Hadamard code. 

Lemma 6 Let $f : \{0,1\}^k \to \{0,1\}$ be a function and $0 < \epsilon < 1/2$. Then there are at most $1/(4\epsilon^2)$
linear functions $l$ such that $f(\cdot)$ and $l(\cdot)$ agree on at least a $1/2 + \epsilon$ fraction of inputs.

This means that good list-decoding is possible, at least combinatorially. From the algorithmic
point of view, we can consider the fact that the input for a decoding algorithm is a string of length
$n = 2^k$, and there are only $2^k$ possible decodings. Therefore, a brute-force decoding algorithm runs
in time polynomial in $n$.
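The brute-force decoder can be sketched as follows. This is a minimal illustration, assuming the received word is stored as a list indexed by the vectors of $\{0,1\}^k$ in lexicographic order; the function names are hypothetical.

```python
from itertools import product

def hadamard_encode(x):
    """Evaluate the linear function a -> a.x (mod 2) at every a in {0,1}^k."""
    k = len(x)
    return [sum(ai * xi for ai, xi in zip(a, x)) % 2
            for a in product((0, 1), repeat=k)]

def brute_force_list_decode(y, k, eps):
    """Return every message whose encoding agrees with y on >= (1/2 + eps) * n positions."""
    n = 2 ** k
    out = []
    for x in product((0, 1), repeat=k):          # only 2^k = n candidate messages
        agree = sum(u == v for u, v in zip(hadamard_encode(x), y))
        if agree >= (0.5 + eps) * n:
            out.append(x)
    return out

k, eps = 4, 0.25
x = (1, 0, 1, 1)
y = hadamard_encode(x)
for pos in (0, 5, 9):                            # corrupt 3 of the 16 positions
    y[pos] ^= 1
decoded = brute_force_list_decode(y, k, eps)
```

The loop tries $2^k = n$ messages and spends $O(nk)$ time on each, so the running time is polynomial in the input length $n$. In this instance the list contains only the original message, since two distinct Hadamard codewords differ in exactly half of the positions.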

In a later section we will see a probabilistic algorithm that runs in time polynomial in $k$ and $1/\epsilon$.

2.5.2 Reed-Solomon Codes 

The list-decoding problem for Reed-Solomon codes can be stated as follows: given $n$ distinct points
$(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$ in $\mathbb{F}_q^2$ and parameters $k, t$, find a list of all polynomials $p$ such that:

1. $p$ has degree $\le k - 1$; and

2. $\#\{i : p(x_i) = y_i\} \ge t$.

With no further constraints on $n$, $k$ and $t$, it is not clear that the list of such polynomials is
small (that is, $\mathrm{poly}(n, k, t)$). In particular, if $t = k$, there can be as many as $\binom{n}{k}$ such distinct polynomials
(pick any $k$ of the points and interpolate). Therefore, we will definitely require that $t > k$ if we
would like to efficiently list-decode.

The first polynomial time algorithm for this problem, for $t > \sqrt{2nk}$, is due to Sudan [Sud97].
The error bound was then improved to $t > \sqrt{nk}$ in [GS99], which is tight.

We give a proof of the following theorem in the Appendix. 

Theorem 7 ([Sud97]) Given a list of $n$ points $(x_1, y_1), \ldots, (x_n, y_n)$ in $\mathbb{F}_q^2$, we can efficiently find
a list of all polynomials $p(x)$ of degree at most $k - 1$ that pass through at least $t$ of these $n$ points,
as long as $t > 2\sqrt{nk}$. Furthermore, the list has size at most $\sqrt{n/k}$.
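To make the problem statement concrete, the following sketch solves a tiny instance by exhaustive search. It is not Sudan's algorithm: it enumerates all $q^k$ candidate polynomials, which is feasible only for toy parameters. The prime field GF(11), the two planted lines and the function names are assumptions made for the illustration.

```python
from itertools import product

def evaluate(coeffs, x, q):
    """Evaluate the polynomial c_0 + c_1 x + ... + c_{k-1} x^{k-1} at x, modulo the prime q."""
    return sum(c * pow(x, j, q) for j, c in enumerate(coeffs)) % q

def brute_force_rs_list_decode(points, k, t, q):
    """Return all polynomials of degree <= k - 1 that pass through at least t of the points."""
    out = []
    for coeffs in product(range(q), repeat=k):
        agreement = sum(evaluate(coeffs, x, q) == y for x, y in points)
        if agreement >= t:
            out.append(coeffs)
    return out

q, k, t = 11, 2, 5
p1, p2 = (3, 2), (1, 7)                 # the lines 3 + 2x and 1 + 7x over GF(11)
# The first six points lie on p1, the remaining five on p2.
points = [(x, evaluate(p1 if x <= 5 else p2, x, q)) for x in range(q)]
result = brute_force_rs_list_decode(points, k, t, q)
```

With $n = 11$ and $k = 2$ we have $\sqrt{nk} \approx 4.7$, so $t = 5$ lies just above the $\sqrt{nk}$ threshold; the list returned contains exactly the two planted lines, because any other line meets each of them in at most one point.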

^^We will often work in settings where the size of the list is upper bounded by a constant. A size polynomial in k 
is also acceptable. 

^''Meaning that for smaller values of t the size of the list may be superpolynomial, and so the problem becomes 
intractable even from a combinatorial perspective. 
2.5.3 Concatenated Codes



Suppose we have an outer code $C_o$ which is an $[N, K, D]_Q$ code and an inner code $C_i$ which is
an $[n, k, d]_q$ code with $q^k = Q$, and that we define the concatenated code $C$, which is then an
$[nN, kK, dD]_q$ code. Suppose we have a good list-decoding algorithm for both the outer code
and the inner code: can we derive a list-decoding algorithm for the concatenated code?

Here is a very simple idea: apply the inner list-decoding algorithm to each block, and so come
up with a sequence of lists. Pick a random element from each list, and construct in this way a
string of length $N$; then apply the outer list-decoding algorithm to this string.

Suppose that the inner decoding algorithm is able to decode from $(1 - \epsilon)n$ errors and produces
a list of size $l$.

Suppose also that, overall, we are given a string that agrees with a valid codeword $C(x)$ of $C$
in at least $2\epsilon nN$ entries. Then there are at least $\epsilon N$ blocks in which there are at most $(1 - \epsilon)n$
errors, and in which the inner decoding algorithm has a decoding consistent with $C(x)$ in the list.
On average, when we pick randomly from the lists, we create a string that has agreement at least
$\epsilon N/l$ with the outer encoding of $x$. If the outer list-decoding algorithm is able to tolerate $(1 - \epsilon/l)N$
errors, then we will find $x$ in the list generated by the outer list-decoding algorithm.
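The counting step in the argument above can be spelled out as follows. The symbols $a_j$ and $B$ are introduced here only for the derivation: $a_j$ is the number of positions of the $j$-th block on which the given string agrees with $C(x)$, and $B$ is the number of blocks with $a_j \ge \epsilon n$, that is, with at most $(1-\epsilon)n$ errors.

```latex
\begin{align*}
2\epsilon nN \;\le\; \sum_{j=1}^{N} a_j
  &\;\le\; B \cdot n + (N - B) \cdot \epsilon n
  \;\le\; B n + \epsilon n N
  && \text{hence } B \ge \epsilon N, \\
\mathbb{E}\bigl[\#\{\text{blocks where the random pick is the correct symbol}\}\bigr]
  &\;\ge\; B \cdot \frac{1}{l} \;\ge\; \frac{\epsilon N}{l} .
\end{align*}
```

The second line uses the fact that in each of the $B$ good blocks the correct symbol is one of at most $l$ elements of the inner list, so a uniformly random pick hits it with probability at least $1/l$.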

This argument can be derandomized by observing that we do not need to choose independently 
from each list. We deduce that 

Theorem 8 If $C_o$ is an $[N, K, D]_Q$ code with an $(L, (1 - \epsilon/l)N)$ list decoding algorithm, and $C_i$ is an
$[n, k, d]_q$ code with an $(l, (1 - \epsilon)n)$ list decoding algorithm, and $q^k = Q$, then the concatenated code
$C$ is an $[nN, kK, dD]_q$ code and it has an $(L, (1 - 2\epsilon)nN)$ list decoding algorithm.

Basically, if both the outer and the inner code can be list-decoded from an arbitrarily large
fraction of errors, then so can their concatenation.

Similarly, one can argue that if the outer code can be list-decoded from an arbitrarily large 
fraction of errors, and the inner code can be list-decoded from a fraction of errors arbitrarily close 
to 1/2, then their concatenation can be list-decoded from a fraction of errors arbitrarily close to 
1/2. 

More sophisticated algorithms for list-decoding concatenated codes are in [GS00b].

2.5.4 Error Rate for Which List-Decoding is Possible

Using the results that we mentioned above we can prove that for every $\epsilon$ and for every $k$ there is
a polynomial time encodable code $C : \{0,1\}^k \to \{0,1\}^n$ that is $(L, (1/2 - \epsilon) \cdot n)$ list decodable in
polynomial time, where $n = \mathrm{poly}(k, 1/\epsilon)$ and $L = \mathrm{poly}(1/\epsilon)$. Thus, a meaningful (and useful) form
of error-correction is possible with a binary code even if the number of errors is close to $n/2$.

By the way, considerably better results are possible, and, in particular, it is possible to have
$n = O(k \cdot \mathrm{poly}(1/\epsilon))$, so that the rate is constant for fixed $\epsilon$. It is also possible to implement the
list-decoder in nearly-linear or even linear time. (For some recent results in these directions, see 
e.g. |(;tSnnhl KHM RTTfM IHTnH] and the references therein.) 

For larger alphabets, it is similarly possible to have list-decodable codes that tolerate a fraction
of errors close to $(1 - 1/q) \cdot n$.
3 Sublinear Time Unique Decoding



In this section we discuss error-correction algorithms that run in sub-linear time, and their relations
to private information retrieval (a type of cryptographic protocol) and to average-case complexity,
as well as to the notions of self-correction, instance-hiding and random-self-reduction.

3.1 Locally Decodable Codes 

Let $C : \Sigma^k \to \Sigma^n$ be an error-correcting code. The results and the algorithms described so far deal
with the following setting: for some message $x \in \Sigma^k$, the codeword $C(x) \in \Sigma^n$ has been "sent,"
however a corrupted string $y \in \Sigma^n$ has been "received," which differs from $C(x)$ in a bounded
number of entries; our goal is to reconstruct $x$, possibly in time polynomial or even linear in $n$. In
this section we deal with algorithms whose running time is sublinear in $n$ and $k$, or even a constant
independent of $n$. An algorithm with such a fast running time cannot possibly reconstruct the
entire message, since it does not even have the time to write it down. Instead, we will look for algorithms
that, given an index $i$ and a corrupted version of $C(x)$, will be able to compute just the entry $x_i$.
Codes with such algorithms are called locally decodable error-correcting codes.

Such codes could be useful in the setting of information storage: a very large amount of in- 
formation (for example several songs) could be encoded as a single codeword and then stored in 
a medium that is subject to become partially corrupted over time (for example a CD, which is 
subject to scratches). When a particular piece of information (for example, a song) is needed, 
then the decoding algorithm will not decode the entire content of the medium, but only the part 
that is needed. Hopefully, then, the decoding time will be proportional only to the length of the 
desired fragment of information, whereas the whole medium will be robust against a number of 
errors proportional to the size of the entire storage. 

As we will see, however, even the best known locally decodable codes have very poor rate, and
this is conjectured to be an inherent problem, and none of them seems to have applications to data
transmission and data storage. In complexity theory and in cryptography, on the other hand, they
have several applications, as we will see.

We start with a formal definition. We make the following convention, which we maintain throughout
the paper: whenever we refer to an oracle algorithm, we assume that the algorithm makes
non-adaptive queries.

Definition 1 (Locally Decodable Code) A code $C : \Sigma^k \to \Gamma^n$ is $(q, \delta, p)$-locally decodable if
there is a probabilistic oracle algorithm $A$ of query complexity at most $q$ such that for every message
$x \in \Sigma^k$, index $i \in \{1, \ldots, k\}$, and string $y$ such that $d(y, C(x)) \le \delta n$ we have

$$\Pr[A^y(i) = x_i] \ge p .$$

The probability is taken over the internal coin tosses of $A$.

^*In practice, the information on music and data CD and on DVD is encoded in a different way. The data is split in 
relatively small blocks, each block is encoded with a variant of the Reed-Solomon code, and then the encoding of each 
block is scattered in non-consecutive locations on the disk. This system has very poor resistance against worst-case 
errors, because one can destroy a block of the original data by damaging its encoding, which is a very small fraction 
of the overall encoding. On the other hand, this system performs well against a small number of "burst" errors, in 
which contiguous locations are damaged. The latter type of errors is a good model for the damage suffered by CDs 
and DVDs 
In this setting, a message $x$ made of $k$ elements of an alphabet $\Sigma$ is encoded as $n$ elements of
an alphabet $\Gamma$. After at most $\delta n$ errors occur, we are interested in reconstructing an entry $x_i$ of $x$.
Algorithm $A$ performs such a task with probability at least $p$ while looking at only $q$ entries of the
corrupted encoding of $x$.

We will mostly be interested in the case in which $q$ is a small constant, $\Sigma = \{0,1\}$, and
$\Gamma = \{0,1\}^l$.

3.2 Relation to Private Information Retrieval 

A private information retrieval scheme is a system in which a "database" $x$, a $k$-bit string, is known
to $q$ independent "servers" $S_1, \ldots, S_q$. A "user" is interested in a bit $x_i$ of $x$ and wants to retrieve
it with a single round of communication, in such a manner that no server can tell the value $i$. More
formally, we require that the distribution of the query sent to each server be independent of $i$. A
weaker requirement is that the distributions corresponding to various $i$ be statistically close in the
view of each server.

Definition 2 (One-Round Private Information Retrieval) A $(1 - \delta)$-secure $q$-server
one-round private information retrieval system with recovery probability $p$ for $k$-bit databases is a
collection of $q + 2$ procedures $(Q, S_1, \ldots, S_q, R)$ that work as follows.

Fix a string $x \in \{0,1\}^k$ and an index $i \in [k]$. On input an index $i$ and random coins, the query
procedure $Q$ computes $q$ queries $j_1, \ldots, j_q \in [n]$. On input the query $j_t$ and the string $x \in \{0,1\}^k$,
the $t$-th server produces (deterministically) the answer $a_t = S_t(x, j_t) \in \{0,1\}^l$. Given $i$, the recovery
procedure $R$ computes $R(i, a_1, \ldots, a_q)$.

We require that the following two conditions hold: 

Recovery For every $x \in \{0,1\}^k$ and $i \in [k]$, there is a probability at least $p$ (over the coin tosses
of $Q$) that the final output of $R$ equals $x_i$.

Privacy For every $x \in \{0,1\}^k$, every $i, i' \in [k]$ and every $t \in [q]$, if we sample at random
$(j_1, \ldots, j_q) \sim Q(i)$ and $(j'_1, \ldots, j'_q) \sim Q(i')$, then the distributions of $j_t$ and $j'_t$ have statistical
distance at most $\delta$.

We call $l$ the answer size of the PIR system, and $\log_2 n$ the query length. The communication
complexity of the system is $q \cdot (l + \log n)$.

The definition is long and technical, but hopefully it is not too hard to follow it if one has clearly 
in mind the intuitive notion that the definition is trying to capture. All known constructions have 
perfect recovery probability p = 1 and they are 1-secure (that is, queries made to the same server 
for two different indices are identically distributed). 

A superficial connection between PIR and LDCs of constant query complexity is that all known
constructions of both objects follow from constructions of a more general object, which we define
below.

Definition 3 (Perfectly Smooth Decoder) A perfectly smooth decoder for a code $C : \Sigma^k \to \Gamma^n$
is a probabilistic oracle algorithm $A$ such that for every $i \in [k]$ and every $x \in \{0,1\}^k$ we have

$$\Pr[A^{C(x)}(i) = x_i] = 1 .$$

Furthermore, if $q$ is the query complexity of $A$, then for every $j \in [q]$ and every $i \in [k]$, the
distribution of the $j$-th oracle query made by $A^{C(x)}(i)$ is uniform over $[n]$.
Setting                                            Construction of perfectly smooth codes    Lower bounds for all LDCs

2 queries, Boolean encoding                        $n = 2^k$                                 $n = 2^{\Omega(k)}$ [KdW03]
2 queries, encoding using $\{0,1\}^l$              $n = l \cdot 2^{k/l}$                     [KdW03]
2 queries, encoding using $\{0,1\}^{O(k^{1/3})}$   $n = 2^{O(k^{1/3})}$ [CGKS98]             barely super-linear [KT00]
3 queries, Boolean encoding                        sub-exponential [BI01]                    polynomial in $k$ [KdW03]
3 queries, encoding using $\{0,1\}^l$              [BI01]                                    [KT00]
$q$ queries, Boolean encoding                      [BIKR02]                                  polynomial in $k$ [KdW03]

Table 1: Main known results on Locally Decodable Codes with decoders of constant query complexity.



Suppose we have a perfectly smooth decoder $A$ of query complexity $q$ for a code $C$. Then it is easy
to see that for every $\delta < 1/q$ the decoder shows that $C$ is also a $(q, \delta, 1 - \delta q)$-locally decodable code.
Indeed, let $y$ be a string that is $\delta$-close to a codeword $C(x)$, and let $i$ be any index. Each query of $A$
is uniformly distributed, so it lands on a position where $y$ and $C(x)$ differ with probability at most $\delta$;
by a union bound, for at least a $1 - \delta q$ fraction of the coin tosses of $A$ the view, and outcome, of
$A^y(i)$ and $A^{C(x)}(i)$ are the same, and so $A^y(i)$ has probability at least $1 - \delta q$ of correctly computing $x_i$.
If $q$ is not a constant, then this observation does not give us an LDC that is able to correct a constant
fraction of errors, so only perfectly smooth decoders of constant query complexity give good LDCs.

Regarding PIR, consider the following approach. The user simulates $A^{C(x)}(i)$ and finds out the
queries $j_1, \ldots, j_q$ that $A$ would have made.^^ Then it sends each query to a different server. Given
$j_t$, the $t$-th server computes $C(x)$ and returns the $j_t$-th entry of $C(x)$. Given these answers, the
user completes the simulation of $A^{C(x)}(i)$ and computes $x_i$.

This PIR system is 1-secure, has perfect recovery, the query size is $\log n$ and the answer size is
$\log |\Gamma|$.
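As an illustration of this reduction, the sketch below instantiates it with the two-query smooth decoder of the Hadamard code from Section 3.3.1, which yields a 2-server PIR with $k$-bit queries and one-bit answers. The function names and the sample database are hypothetical, and the code is only a toy model of the three roles (user query generation, servers, user recovery).

```python
import random

def dot(a, x):
    """Inner product of two bit vectors, modulo 2."""
    return sum(ai & xi for ai, xi in zip(a, x)) % 2

def make_queries(i, k, rng):
    """User side: the two positions the Hadamard smooth decoder would read for index i."""
    a = [rng.randrange(2) for _ in range(k)]
    b = list(a)
    b[i] ^= 1                      # b = a XOR e_i; on its own, b is also uniform
    return a, b

def server_answer(x, query):
    """Server side: return the entry of the Hadamard encoding of x indexed by the query."""
    return dot(query, x)

def recover(answer_1, answer_2):
    """User side: by linearity, the XOR of the two answers is e_i . x = x_i."""
    return answer_1 ^ answer_2

rng = random.Random(0)
x = [1, 0, 1, 1, 0, 0, 1, 0]       # the database, known to both servers
recovered = []
for i in range(len(x)):
    q1, q2 = make_queries(i, len(x), rng)
    recovered.append(recover(server_answer(x, q1), server_answer(x, q2)))
```

Each server sees a single uniformly distributed vector, whose distribution does not depend on $i$, while the user always recovers $x_i$. The communication is $k$ bits per query, which is why this particular instantiation is only a toy example.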

The main known results for constant values of $q$ are shown in Table 1.

For $q = 2$, the Hadamard code gives a perfectly smooth code with exponential encoding length.
The exponential blow-up is necessary for all 2-query binary LDCs [KdW03]. Even for $q = 2$ and
large alphabets, tight results are not known. Chor et al. [CGKS98] show that one can achieve
encoding length roughly $2^{k^{1/3}}$ with an alphabet of size roughly $2^{k^{1/3}}$, which corresponds to a PIR
with communication complexity $O(k^{1/3})$. For such alphabet sizes, the only applicable lower bounds
are in [KT00], and they are barely super-linear.

For $q = 3$, even for binary alphabet there is a huge gap between a polynomial lower bound
and a slightly sub-exponential upper bound. For larger $q$, the best known Boolean constructions
achieve sub-exponential encoding length, while the best lower bounds are only polynomial in $k$.

Below we give some more references to constructions and lower bounds. 

3.3 Local Decoders with Constant Query Complexity 

We give an overview of some simple constructions of perfectly smooth codes. We will not get
into details of the best known construction, the one by Beimel et al. [BIKR02], which is somewhat
complicated.

^^Recall that all the oracle algorithms in this paper make non-adaptive queries. 
3.3.1 Hadamard Code



Let us start by considering the Hadamard code $H : \{0,1\}^k \to \{0,1\}^{2^k}$. In the Hadamard code, the
encoding $H(x)$ has one entry for every vector $a \in \{0,1\}^k$, and the content of the $a$-th location
$H(x)_a$ is the bit $a \cdot x = \sum_j a_j x_j \pmod 2$. Suppose we have oracle access to $H(x)$ and we want to
reconstruct $x_i$ for some index $i$. Clearly $x_i = e_i \cdot x$, where $e_i$ is the vector with a 1 in the $i$-th position
and 0s elsewhere; however we cannot just read the $e_i$-th position of $H(x)$, because a smooth decoder
must make uniformly distributed queries. Instead, we use the following trick: we pick at random
a vector $a \in \{0,1\}^k$ and we read the entries $H(x)_a$ and $H(x)_{a \oplus e_i}$, which will return, respectively,
$a \cdot x$ and $(a \oplus e_i) \cdot x$. By linearity, the xor of these two values will be $(a \oplus a \oplus e_i) \cdot x = e_i \cdot x = x_i$.
This idea dates back to [BLR93].

If the smooth decoder is applied to a string $y$ that is at relative distance $\delta < 1/4$ from a valid
codeword $C(x)$, then, for every $i$, the decoder succeeds with probability at least $1 - 2\delta$ in computing
$x_i$. If the relative distance between $C(x)$ and $y$ is $1/4 - \epsilon$, then the success probability of the decoder
is $1/2 + 2\epsilon$, which can be amplified to, say, $1 - 1/4k$ by repeating the decoding independently
$O(\epsilon^{-2} \log k)$ times and then taking the majority value. If we do this for every $i$, we get an algorithm
that runs in time $O(\epsilon^{-2} k \log k)$ and computes $x$ with probability at least $3/4$.
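The two-query decoder and its amplification by majority vote can be sketched as follows. This is a minimal illustration, assuming the (possibly corrupted) codeword is stored as a dictionary indexed by the vectors of $\{0,1\}^k$; the function names, the seed and the number of repetitions are choices made for the example.

```python
import random
from itertools import product

def hadamard_encode(x):
    """Map each a in {0,1}^k to the bit a.x (mod 2)."""
    return {a: sum(ai & xi for ai, xi in zip(a, x)) % 2
            for a in product((0, 1), repeat=len(x))}

def flip(a, i):
    """Return a XOR e_i."""
    return a[:i] + (a[i] ^ 1,) + a[i + 1:]

def decode_bit(y, i, k, rng):
    """Two-query decoder: read y at a uniformly random a and at a XOR e_i."""
    a = tuple(rng.randrange(2) for _ in range(k))
    return y[a] ^ y[flip(a, i)]

def decode_bit_majority(y, i, k, rng, repetitions):
    """Amplify by repeating the two-query decoder and taking the majority value."""
    ones = sum(decode_bit(y, i, k, rng) for _ in range(repetitions))
    return int(2 * ones > repetitions)

def success_fraction(y, x, i):
    """Exact fraction of the random choices a for which the two-query decoder outputs x_i."""
    good = sum((y[a] ^ y[flip(a, i)]) == x[i] for a in y)
    return good / len(y)

rng = random.Random(0)
k = 8
x = (1, 0, 1, 1, 0, 0, 1, 0)
clean = hadamard_encode(x)
y = dict(clean)
corrupted = rng.sample(sorted(y), 25)     # corrupt about 10% of the 256 positions
for a in corrupted:
    y[a] ^= 1
delta = len(corrupted) / len(y)
fractions = [success_fraction(y, x, i) for i in range(k)]
decoded = tuple(decode_bit_majority(y, i, k, rng, 101) for i in range(k))
```

The list `fractions` reports, for each index, the exact success probability of a single run on the corrupted word, which is at least $1 - 2\delta$ because a run can fail only if one of its two uniformly distributed queries hits a corrupted position.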

3.3.2 Polynomial Codes 

A similar idea (dating back to [BF90]) can be used to get a perfectly smooth decoder for a variant
of the Reed-Muller code. The variant will have the property of being a systematic code, meaning
that the message occurs as a substring of the encoding.

To make the Reed-Muller code systematic, we proceed as follows. We have a field $\mathbb{F}$, a subset
$S \subseteq \mathbb{F}$, a degree parameter $t$ and a number of variables $m$. Besides, we also choose another subset
$A \subseteq S$ of size $|A| = t/m$. In the standard Reed-Muller code, we would just consider all possible
$\binom{m+t}{m}$ coefficients that an $m$-variate degree-$t$ polynomial can have, and we encode a message of
length $\binom{m+t}{m}$ by interpreting each message coordinate as a coefficient. This time, instead, we have
a (shorter) message of length $|A|^m = (t/m)^m$, which we think of as a function $f : A^m \to \mathbb{F}$; using
interpolation, we find a polynomial $p$ of degree $\le t$ that agrees with $f$ on $A^m$. The evaluation of this
polynomial on $S^m$ will be our encoding.

This is not much worse than the standard encoding, and, for example, it remains true that
the rate is constant for constant $m$. The advantage is that the original message is a part of
the encoding (the evaluation of the polynomial on $A^m$). A code with such a property is called a
systematic code.

In the following discussion we assume $S = \mathbb{F}$.

Consider now the following algorithm that, given oracle access to a polynomial $p : \mathbb{F}^m \to \mathbb{F}$ of
degree $t$, and given $a \in \mathbb{F}^m$, computes $p(a)$ by making random oracle queries.

1. Choose a random $b \in \mathbb{F}^m$ and consider the line $l(z) = a + zb$.

2. Query $p$ at locations $l(1), l(2), \ldots, l(t + 1)$.

3. Compute the unique univariate degree-$t$ polynomial $q$ such that $q(z) = p(l(z))$ for $z =
1, 2, \ldots, t + 1$. Return $q(0)$.

Here, in the algorithm description, we used "2", ..., "$t + 1$" as the names of $t$ elements of $\mathbb{F}$
distinct from 1 and 0.
The algorithm is based on the following idea: if $p(x_1, \ldots, x_m)$ is a multivariate polynomial of
degree $t$, and $l(z) = (a_1 + z b_1, \ldots, a_m + z b_m)$ is a line, then $p(l(z))$ is a univariate polynomial in
$z$ of degree at most $t$. As such, it can be reconstructed if we are given its values at $t + 1$ different
points.

So we pick at random a line $l(\cdot)$ that passes through the point $a$ (at which we want to evaluate
$p(\cdot)$), we read the value of $p$ at $t + 1$ different points of the line, we reconstruct the univariate polynomial
$p(l(z))$, and finally, by evaluating it at 0, we compute $p(l(0)) = p(a)$. The second observation to
make, which completes the analysis, is that, on a random line of the form $l(z) = a + zb$, every point,
except $l(0) = a$, is uniformly distributed, and so the procedure is making uniform queries.
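The three steps of the algorithm can be sketched as follows. The sketch assumes a prime field, so that field arithmetic is ordinary arithmetic modulo a prime $P$, and it uses Lagrange interpolation to evaluate $q(0)$; the example polynomial and all names are hypothetical.

```python
import random

P = 13  # a small prime, so that the integers modulo P form a field (assumption of the sketch)

def poly(z):
    """Example polynomial of total degree t = 3 in m = 2 variables over GF(P)."""
    z1, z2 = z
    return (2 * z1 * z1 * z2 + 5 * z1 * z2 + z2 + 7) % P

def interpolate_at_zero(xs, ys):
    """Lagrange interpolation mod P: value at 0 of the polynomial through the points (xs, ys)."""
    total = 0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        num, den = 1, 1
        for j, xj in enumerate(xs):
            if j != i:
                num = num * (-xj) % P
                den = den * (xi - xj) % P
        # pow(den, P - 2, P) is the inverse of den modulo the prime P
        total = (total + yi * num * pow(den, P - 2, P)) % P
    return total

def local_evaluate(oracle, a, t, rng):
    """Compute p(a) from t + 1 oracle queries on a random line through a."""
    b = [rng.randrange(P) for _ in range(len(a))]       # step 1: random direction
    zs = list(range(1, t + 2))                          # t + 1 distinct nonzero field elements
    ys = [oracle(tuple((ai + z * bi) % P for ai, bi in zip(a, b))) for z in zs]   # step 2
    return interpolate_at_zero(zs, ys)                  # step 3: q(0) = p(a)

rng = random.Random(0)
value = local_evaluate(poly, (4, 9), 3, rng)
```

The decoder never queries the point $a$ itself; each queried point $a + zb$ with $z \ne 0$ is uniformly distributed over $\mathbb{F}^m$, which is what makes the decoder perfectly smooth.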

A disadvantage of this procedure is that it can never lead to a constant query complexity,
because the degree can never be a constant in the version of the Reed-Muller code we described
before. We now describe another variant in which we encode a message as a constant-degree
polynomial, so that the above algorithm has constant query complexity.

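Suppose that $k \le \binom{m}{d}$. We index each message entry by a unique $d$-element subset $S$ of $[m]$.
Given a message $x = (x_S)_{S \subseteq [m], |S| = d}$ we define the polynomial

$$p_x(z_1, \ldots, z_m) = \sum_{S : |S| = d} x_S \prod_{j \in S} z_j$$

over the field $\mathbb{F}$, where $|\mathbb{F}| \le 2t$ and $t = d$ is the total degree of $p_x$. The encoding $C(x)$ is
obtained by evaluating $p_x$ over all points in $\mathbb{F}^m$.

Since $p_x$ has total degree $t$, the decoding algorithm described above also works for the code $C$.
Moreover, $C$ is systematic: we can write $x_S = p_x(e_S)$, where $e_S = (e_{S,1}, \ldots, e_{S,m}) \in \mathbb{F}^m$ and

$$e_{S,j} = \begin{cases} 1 & \text{if } j \in S, \\ 0 & \text{otherwise.} \end{cases}$$

The code $C$ can encode messages of length up to $\binom{m}{d}$ and has codeword length $n = |\mathbb{F}|^m$,
provided $|\mathbb{F}| \ge t + 2$, so that the decoder has $t + 1$ distinct nonzero points to query. If we take, say,
$|\mathbb{F}| \le 2t$, this yields $n = 2^{O(t \log t \cdot k^{1/t})}$, where $t = q - 1$.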
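The systematic property $x_S = p_x(e_S)$ can be checked directly on a small instance. The sketch below assumes the prime field GF(5) with $m = 4$ and $d = 2$; these parameters, the sample message and the function names are illustrative choices.

```python
from itertools import combinations

P = 5          # field size; a prime, so arithmetic mod P is field arithmetic
m, d = 4, 2    # number of variables and size of the index sets (total degree t = d)

subsets = list(combinations(range(m), d))    # one message entry per d-element subset of [m]

def p_x(x, z):
    """Evaluate p_x(z) = sum over S of x_S * prod_{j in S} z_j, modulo P."""
    total = 0
    for x_s, subset in zip(x, subsets):
        term = x_s
        for j in subset:
            term = term * z[j] % P
        total = (total + term) % P
    return total

def indicator(subset):
    """The point e_S: 1 in the coordinates that belong to S, 0 elsewhere."""
    return tuple(1 if j in subset else 0 for j in range(m))

x = [3, 1, 4, 1, 0, 2]                       # k = C(4, 2) = 6 message entries in GF(5)
systematic = [p_x(x, indicator(s)) for s in subsets]
codeword_length = P ** m                     # n = |F|^m evaluation points
```

At the point $e_S$ every monomial indexed by a set $S' \ne S$ vanishes, because $S'$ contains a coordinate outside $S$, so only the term $x_S$ survives. Since the degree is $t = 2$, the line decoder described earlier needs only $q = 3$ queries for this code.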

3.4 Local Decoder with Polylogarithmic Complexity 

If we want decoders with constant query complexity, then the best known constructions have
super-polynomial encoding length.

By adjusting the parameters in the Reed-Muller-like constructions of the previous sections, it is
however possible to have perfectly smooth codes with $n = \mathrm{poly}(k)$, or even $n = O(k)$, and sub-linear
query complexity. The problem is that a perfectly smooth decoder of super-constant query complexity is
not necessarily a good local decoder able to correct a large number of errors.

There is, however, a way to get a good local decoder even when the degree of the polynomial 
(and so the query complexity of the decoding algorithm) are high. 

We are considering the following problem: we have a message which we view as a function
$f : A^m \to \mathbb{F}$, for some subset $A \subseteq \mathbb{F}$. We encode the message by finding a polynomial $p$ of degree $t$
that agrees with $f$ on all of $A^m$, and the encoding is the evaluation of $p$ at all points of $\mathbb{F}^m$.

Now we have oracle access to a corrupted encoding $g : \mathbb{F}^m \to \mathbb{F}$, which disagrees with $p$ on
a $\delta$ fraction of the elements of $\mathbb{F}^m$. We are given an entry $a \in A^m$ and we would like to compute
$f(a) = p(a)$ with high probability by using oracle access to $g$.
As before, we pick a random $b \in \mathbb{F}^m$ and consider the random line $l(z) = a + bz$, $z \in \mathbb{F}$, and
we would like to find the univariate polynomial $q(z) = p(l(z))$, because if we find $q(\cdot)$ then we have
also found $f(a) = p(a) = q(0)$. Instead of just reading $t + 1$ points of $p(l(z))$ and interpolating (we
don't have access to $p$), we read $g(l(z))$ for, say, $3t$ values of $z$, and then we apply the Berlekamp-Welch
algorithm to find the degree-$t$ polynomial $q(\cdot)$ that agrees the most with these values. The
Berlekamp-Welch algorithm will indeed find $q(\cdot)$ provided that there are fewer than $t$ places among
those we read where $g(l(\cdot))$ differs from $p(l(\cdot))$. Recall that each of these places is a point on a random
line through $a$, and so it is uniformly distributed, so that on average we expect to encounter at most $3\delta t$
errors, which is less than $t/4$, say, if $\delta < 1/12$. If so, then a Markov argument shows that with probability
at least $3/4$ we indeed find fewer than $t$ errors and so we correctly find $q(\cdot)$.
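The Markov argument can be written out explicitly. The random variable $X$ is notation introduced here: it counts the queried points $l(z)$, among the $3t$ that are read, at which $g$ and $p$ differ.

```latex
\begin{align*}
\mathbb{E}[X] &= \sum_{z} \Pr\bigl[g(l(z)) \ne p(l(z))\bigr] \;\le\; 3t \cdot \delta \;<\; \frac{t}{4}
  && \text{if } \delta < \tfrac{1}{12}, \\
\Pr[X \ge t] &\le \frac{\mathbb{E}[X]}{t} \;<\; \frac{1}{4}
  && \text{(Markov's inequality).}
\end{align*}
```

So with probability more than $3/4$ fewer than $t$ of the values read are wrong, which is the condition under which the Berlekamp-Welch algorithm recovers $q(\cdot)$.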

This is far from optimal, but it gives an idea of how good LDCs with super-constant query 
complexity work. 

By playing with the parameters, it is possible to construct, for every $\epsilon$, binary LDCs $C :
\{0,1\}^k \to \{0,1\}^n$ where $n = O(k)$ and the decoding algorithm has query complexity $n^\epsilon$. The best
lower bound on query complexity for LDCs of linear encoding length is logarithmic, and so we have
again an exponential gap between constructions and lower bounds. In this case too my guess is
that the lower bound should be improved.

Open Question 1 Prove that there cannot be a binary $(q, \delta, 3/4)$-LDC $C : \{0,1\}^k \to \{0,1\}^n$ where
$q = O(\log n)$ and $n = O(k)$.

3.5 Application to Average-Case Complexity 

Techniques sketched in the previous section show the existence of LDCs $C : \{0,1\}^k \to \{0,1\}^n$ with
$n = \mathrm{poly}(k)$, with a decoder of $\mathrm{poly} \log k$ complexity, and with a polynomial time encoding algorithm.
Let us now see how this applies to average-case complexity.

Let $L$ be an EXP-complete problem, and for an input length $t$ let us consider the restriction of
$L$ to inputs of length $t$. We can see $L$ restricted to such inputs as being a binary string of length
$2^t$ (the truth-table of a Boolean function with $t$-bit inputs).

Let us encode this string using our code $C$: we get a string of length $2^{O(t)} = 2^{t'}$, and let us
think of this string as defining a new problem $L'$ on inputs of length $t'$. If $L$ was in EXP, then so is
$L'$. In fact, if $L$ was EXP-complete, then so is $L'$. In addition, if we have an algorithm for $L'$ that
is good on average, the algorithm together with the local decoder gives a probabilistic algorithm for
$L$ that works on all inputs, and so $\mathrm{EXP} \subseteq \mathrm{BPP}$.

This argument shows that if every problem in EXP can be solved well on average then
$\mathrm{EXP} \subseteq \mathrm{BPP}$.

This argument raises many questions, such as the following: 

• Can the same argument work for PSPACE? The answer is yes, provided we can construct a 
good LDC in logarithmic space. This is in fact easy to do, and so the same argument does 
indeed apply to PSPACE. 

• Can the same argument work for NP? Viola [Vio03] shows that the LDC decoding does not
work for NP, and, more generally, for any problem in the polynomial hierarchy. Bogdanov and
Trevisan [BT03] show that worst-case to average-case reductions in NP are problematic even
if they are not based on LDCs.
3.6 Lower Bounds



In this section we give an overview of lower bounds for locally decodable codes.

The following notion is a useful relaxation of the notion of perfectly smooth code.

Definition 4 (Smooth Decoder) A $(q, c, p)$-smooth decoder for a code $C : \Sigma^k \to \Gamma^n$ is a
probabilistic oracle algorithm $A$ of query complexity $q$ such that for every $i \in [k]$ and every $x \in \{0,1\}^k$ we
have

$$\Pr[A^{C(x)}(i) = x_i] \ge p .$$

Furthermore, for every $t \in [q]$, every $i \in [k]$, and every $j \in [n]$, the probability that the $t$-th query of
$A^{C(x)}(i)$ is $j$ is at most $c/n$. A code having a $(q, c, p)$-smooth decoder is also called a $(q, c, p)$-smooth
code.

A perfectly smooth decoder of query complexity $q$ is a $(q, 1, 1)$-smooth decoder.

The definition of smooth decoder allows the algorithm to use non-uniform distributions of
queries; however, no position is queried with probability much higher than it would be under
the uniform distribution.

The following result shows that every locally decodable code has a smooth decoder. 

Lemma 9 ([KT00]) If $C : \Sigma^k \to \Gamma^n$ is a $(q, \delta, p)$-LDC, then it is also a $(q, q/\delta, p)$-smooth code.

Goldreich et al. [GKST02] also show that every private information retrieval system implies a
smooth code with related parameters.

These results show that it is enough to prove lower bounds for smooth codes, a strategy followed
explicitly in [KT00, GKST02] and implicitly in [KdW03].

Another step is the following. 

Lemma 10 If $C : \Sigma^k \to \Gamma^n$ is a $(q, c, 1/|\Sigma| + \epsilon)$-smooth code, then for every index $i \in [k]$ there is
a collection $M_i$ of $\Omega(n\epsilon/c)$ disjoint $q$-tuples $(j_1, \ldots, j_q)$ such that for a random $x$ it is possible to
predict with probability bounded away from one half the entry $x_i$ given the entries $C(x)_{j_1}, \ldots, C(x)_{j_q}$.

If $C : \mathbb{F}^k \to \mathbb{F}^n$ is linear, then the conclusion of the above Lemma has a simpler interpretation.
If we write $C(x) = (a_1 \cdot x, \ldots, a_n \cdot x)$, then for every $i$ there is a collection $M_i$ of $\Omega(n\epsilon/c)$ disjoint
$q$-tuples $(j_1, \ldots, j_q)$ such that the vector $e_i$ is in the span of $a_{j_1}, \ldots, a_{j_q}$.

A converse can also be proved, that is, if the above collections exist then the code is smooth
and also locally decodable, which means that there is no loss in generality in trying to prove a lower
bound using this combinatorial structure. The exponential lower bound in [GKST02] follows then
from the following combinatorial result.

Lemma 11 Let $a_1, \ldots, a_n \in \{0,1\}^k$ be (not necessarily distinct) vectors and suppose that for every
$i \in [k]$ there is a set $M_i$ of $\delta n$ disjoint pairs $(j, j')$ such that $a_j \oplus a_{j'} = e_i$.
Then $n \ge 2^{2\delta k}$.
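The Hadamard code is the basic example of a sequence of vectors satisfying the hypothesis of the lemma: taking $a_1, \ldots, a_n$ to be all the vectors of $\{0,1\}^k$, each direction $e_i$ admits a perfect matching of $n/2$ disjoint pairs. The following sketch builds these matchings explicitly; the variable and function names are hypothetical.

```python
from itertools import product

k = 4
vectors = list(product((0, 1), repeat=k))       # a_1, ..., a_n with n = 2^k
index_of = {a: j for j, a in enumerate(vectors)}
n = len(vectors)

def matching(i):
    """Disjoint pairs (j, j') with a_j XOR a_j' = e_i, one pair per vector with a 0 in position i."""
    pairs = []
    for j, a in enumerate(vectors):
        if a[i] == 0:
            partner = a[:i] + (1,) + a[i + 1:]  # a XOR e_i
            pairs.append((j, index_of[partner]))
    return pairs

def xor(a, b):
    """Coordinate-wise XOR of two bit vectors."""
    return tuple(u ^ v for u, v in zip(a, b))

matchings = [matching(i) for i in range(k)]
delta = len(matchings[0]) / n                   # fraction of pairs relative to n
```

Here every matching covers all $n$ positions, so $\delta = 1/2$, and the encoding length is $n = 2^k$; an exponential lower bound is therefore the best one can hope for from this combinatorial structure.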

A proof of the following combinatorial result would give the first super-polynomial lower bound
for the case of linear codes decodable with three queries. (So far, super-polynomial lower bounds
have been proved only for the case of two queries. Upper bounds are super-polynomial for all
constant query complexities.)
Open Question 2 Suppose that $a_1, \ldots, a_n$ is a sequence of elements of $\{0,1\}^k$ such that for every
$i \in [k]$ there is a set of $\Omega(n)$ disjoint triples $(j_1, j_2, j_3)$ such that $a_{j_1} \oplus a_{j_2} \oplus a_{j_3} = e_i$. Prove that $n$
must be super-polynomial in $k$.

This is the natural next lower bound question to address, and it is much harder than it looks.
The best lower bounds for LDCs are currently those from [KdW03], proved using quantum
information theory.

The use of quantum arguments in a "classical" problem is surprising, and it would be interesting
to see a purely combinatorial proof of the same results.

Open Question 3 Re-prove the results of [KdW03] without using quantum information theory.



3.7 Notes and References 

The notion of locally decodable codes was implicitly discussed in various places in the early
1990s, most notably in [BFLS91, Sud92], and it was explicitly defined in [KT00], where smooth
codes are also defined. The notion of private information retrieval was introduced by Chor and
others [CGKS98]. Locally decodable codes, private information retrieval and smooth codes can be
seen as the combinatorial analogs of notions that had been studied in complexity theory in the late
1980s and early 1990s. In particular, one can see the decoding procedure of a locally decodable
code as a combinatorial version of a self-corrector [BK89, Lip90, BLR93], a perfectly smooth
decoder is analogous to a random-self-reduction, a notion explicitly defined in [AFK89, FKN90],
and a private information retrieval system is analogous to an "instance-hiding" scheme [AFK89].

The perfectly smooth decoder for Hadamard codes is due to Blum and others [BLR93], and the
one for Reed-Muller codes is due to Beaver and Feigenbaum [BF90]. There has been a substantial amount
of work devoted to the construction of efficient private information retrieval schemes, leading to
the sophisticated construction of Beimel and others [BIKR02]. Work by Ambainis, Beimel, Ishai and
Kushilevitz [Amb97, IK99, BI01] is particularly notable. The Reed-Muller decoder of Section 3.4
is due to Gemmell and others [GLR+91]. It should be noted that there are other models and
questions about private information retrieval that we did not discuss in this section. In particular,
we did not discuss the notion of computationally secure private information retrieval, in which the
distributions of queries are only computationally indistinguishable, the notion of symmetric private
information retrieval, in which the user does not get any information about $x$ other than $x_i$, and
other algorithmic problems, such as issues of efficiency for the server.

Lower bounds for private information retrieval were first proved by Mann [Man98]. Lower
bounds for smooth codes, which imply lower bounds for locally decodable codes and for private
information retrieval, are proved in [KT00, GKST02, Oba02, KdW03].



4 Sublinear Time List Decoding 

Sub-linear time list-decoding is perhaps a less intuitive notion to define than sub-linear time unique
decoding.

As in the previous section, we have some code $C : \{0,1\}^k \to \{0,1\}^n$ and we have oracle access to
a string $y \in \{0,1\}^n$ that is within some distance $d$ from a codeword, and we want to find all the
messages $x$ such that $C(x)$ and $y$ are within distance $d$.

Since the decoder has to run in time $o(k)$, it cannot output the full list; instead, it will
output a list of "compressed" representations of messages. The compressed representation
of a message $x$ is the code of an efficient probabilistic oracle algorithm that, given $i$
and oracle access to $y$, outputs $x_i$ with high probability.

Another way to look at this setting is to think of the decoding algorithm as outputting a list of 
"local decoders" as previously defined, one for every message in the list. 

This model is discussed in detail in [Sud00], along with the description of several applications.
We will quickly review such applications, and refer the reader to [Sud00] for more details, and we
will devote more space to results by Akavia et al. [AGS03] that postdate [Sud00].

4.1 Formal Definition of Local List-Decoder 

Let us fix a model of computation to describe oracle algorithms. 

Definition 5 (Local List-Decoding) A probabilistic oracle algorithm $A$ is a local list-decoder
for a code $C : \Sigma^k \to \Sigma^n$ for radius $r$ if, for every string $y \in \Sigma^n$, $A^y$ outputs a list of probabilistic
oracle algorithms $D_1, \ldots, D_L$ such that for every string $x$ such that $d_H(C(x), y) \leq rn$ the following
happens with probability at least 3/4 over the random choices of $A^y$:

$\exists j \in [L] . \forall i \in [k] . \Pr[D_j^y(i) = x_i] \geq 3/4$ .

The probability in the above expression is taken over the random choices of $D_j^y$.

We note that we are interested in the complexity both of $A$ and of $D_1, \ldots, D_L$, and, ideally,
all these complexities would be constant. Interestingly, constant complexity (at least, constant
oracle query complexity) is achievable for the Hadamard code when $r = 1/2 - \epsilon$, for constant $\epsilon$.

In applications to average-case complexity and cryptography, running time poly-logarithmic in
$n$ is also acceptable, and we will sketch a proof (due to Sudan and others [STV01]) of the fact that
a variant of the Reed-Muller codes has a local list-decoder of complexity polylogarithmic in $k$ and
$n$ and polynomial in $1/\epsilon$ for radius $r = 1/2 - \epsilon$.

In the case of the Hadamard code, for which $n = 2^k$, poly-logarithmic complexity in $n$ is
equivalent to polynomial complexity in $k$, and a local list-decoder has enough time to explicitly
output the list of messages. Goldreich and Levin [GL89] present a list-decoder for the Hadamard
code that runs in time polynomial in $k$ and in $1/\epsilon$ when $r = 1/2 - \epsilon$. The local decoder of
constant (depending on $\epsilon$) query complexity can be derived from an alternative proof of the result
of Goldreich and Levin attributed to Rackoff. We note that, in most applications of the Goldreich-Levin
result, $1/\epsilon$ is either polynomially related to $k$ or even super-polynomial in $k$, so a local
decoder of complexity polynomial in $1/\epsilon$ and independent of $k$ is not significantly more efficient
than the original Goldreich-Levin algorithm.

4.2 Local List Decoders for the Hadamard Code and for Polynomial Codes 

We state some known results about local list-decoding of error-correcting codes. We will prove
the Goldreich-Levin result in Section 4.4.

Theorem 12 ([GL89]) Let $H : \{0,1\}^k \to \{0,1\}^{2^k}$ be the Hadamard code. There is an algorithm
that, given oracle access to $y \in \{0,1\}^{2^k}$ and a parameter $\epsilon$, runs in $\mathrm{poly}(k/\epsilon)$ time and outputs,
with high probability, a list of all the strings $x \in \{0,1\}^k$ such that the relative Hamming distance
between $y$ and $H(x)$ is at most $1/2 - \epsilon$.

The theorem has a stronger form in which the algorithm runs in $\mathrm{poly}(1/\epsilon)$ time, independent of $k$,
and outputs a list of local decoders, each running in $\mathrm{poly}(1/\epsilon)$ time.
The next result that we state is for polynomial encodings. 

Let $\mathbb{F}$ be a field, $m$ be an integer, and $A \subseteq \mathbb{F}$ be a subset. Consider the polynomial encoding
$C : \mathbb{F}^{|A|^m} \to \mathbb{F}^{|\mathbb{F}|^m}$ in which we think of a message as a function $f : A^m \to \mathbb{F}$, and its encoding is
obtained by finding, using interpolation, a polynomial $p$ of degree $\leq m|A|$ that agrees with $f$ on
$A^m$, and then evaluating $p$ on all the points of $\mathbb{F}^m$.
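As a concrete illustration, here is a minimal sketch of this encoding in the simplest case $m = 1$, with $\mathbb{F}$ a prime field $\mathbb{F}_p$. The function names, the use of Lagrange interpolation, and the toy parameters are assumptions made for this example; the general multivariate case is not covered.

```python
def interpolate_eval(points, values, z, p):
    """Evaluate at z the unique polynomial of degree < len(points) over F_p
    that takes the given values on the given distinct points (Lagrange form)."""
    total = 0
    for i, (xi, yi) in enumerate(zip(points, values)):
        num, den = 1, 1
        for j, xj in enumerate(points):
            if j != i:
                num = num * (z - xj) % p
                den = den * (xi - xj) % p
        # pow(den, -1, p) is the inverse of den modulo the prime p (Python 3.8+).
        total = (total + yi * num * pow(den, -1, p)) % p
    return total


def polynomial_encode(message, A, p):
    """Encode f : A -> F_p (a list of values, one per element of A)
    as the evaluations of its interpolating polynomial on all of F_p."""
    return [interpolate_eval(A, message, z, p) for z in range(p)]


p = 13
A = [0, 1, 2, 3]
f = [5, 0, 7, 11]
codeword = polynomial_encode(f, A, p)
assert [codeword[a] for a in A] == f        # the encoding extends f
```

Two distinct messages give polynomials of degree less than $|A|$ whose difference is non-zero, so their encodings agree in fewer than $|A|$ of the $|\mathbb{F}|$ positions.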

Theorem 13 ([STV01]) There is a constant $c$ such that the following happens. Let
$C : \mathbb{F}^{|A|^m} \to \mathbb{F}^{|\mathbb{F}|^m}$ be the polynomial encoding described above. There is an algorithm that, given oracle access
to a string $g \in \mathbb{F}^{|\mathbb{F}|^m}$ and a parameter $\epsilon > c \cdot \sqrt{|A| m / |\mathbb{F}|}$, runs in time
$\mathrm{poly}(\epsilon^{-1}, m, |A|, \log |\mathbb{F}|)$ and
outputs a list of local decoders such that, for every $f \in \mathbb{F}^{|A|^m}$ such that the relative distance between
$C(f)$ and $g$ is less than $1 - \epsilon$, there is a local decoder in the list that computes $f$ given oracle access
to $g$.

In simpler terms, Theorem 13 states that for every $\epsilon > 0$ there is an efficiently computable
error-correcting code $C : \mathbb{F}^k \to \mathbb{F}^n$ such that $n = \mathrm{poly}(\epsilon^{-1}, k)$ and such that local list decoding can
be performed in $\mathrm{poly}(\epsilon^{-1}, \log n)$ time even after $(1 - \epsilon)n$ errors have occurred.

The code can be concatenated with a binary code to obtain a binary locally list-decodable code. 

Theorem 14 ([STV01]) For every $\epsilon$ and $k$ there is a polynomial time encodable code
$C : \{0,1\}^k \to \{0,1\}^n$ with $n = \mathrm{poly}(k, \epsilon^{-1})$ and a local list decoding algorithm that runs in time
$\mathrm{poly}(\epsilon^{-1}, \log n)$ and is able to correct up to $(1/2 - \epsilon)n$ errors.

4.3 Applications to Average Case Complexity 

Average-case complexity is an ideal application for sub-linear time list decoders, since one can deal
with average-case algorithms that make a very large fraction of errors. On the other hand, coding-theoretic
methods can prove average-case complexity results only for classes like PSPACE and
EXP, while one is typically interested in the average-case complexity of problems within NP.

A strong motivation for the study of average-case complexity in EXP came from a result by
Nisan and Wigderson [NW94]. Before stating the result, let us introduce the following notion: a
decision problem on inputs of length $n$ is $(S(n), \delta(n))$-average case hard if every circuit $C$ of size
$\leq S(n)$ fails to solve the problem on at least a $\delta(n)$ fraction of inputs of length $n$.

Theorem 15 (Nisan-Wigderson) Suppose that there is a problem in DTIME$(2^{O(n)})$ that is
$(2^{\Omega(n)}, 1/2 - 1/2^{\Omega(n)})$-average case hard. Then P = BPP.

The Nisan- Wigderson result shows an extremely strong conclusion from a very strong assump- 
tion. The postulated average-case complexity is very high, and the assumption would sound more 
natural if it referred to standard (worst-case) circuit complexity. 

Theorem 16 (Impagliazzo-Wigderson [IW97]) Suppose that there is a problem $L$ in
DTIME$(2^{O(n)})$ that has circuit complexity $2^{\Omega(n)}$; then there is also a problem $L'$ in DTIME$(2^{O(n)})$
that is $(2^{\Omega(n)}, 1/2 - 1/2^{\Omega(n)})$-average case hard. (And P = BPP.)

The Impagliazzo-Wigderson proof was very complicated, and their construction included several
parts, one of them being, essentially, a Reed-Muller encoding of the starting problem.

The code and the decoder of Theorem 14 give a simpler proof of Theorem 16 as follows. Let
$L$ be the problem in the assumption of Theorem 16. For every $n$, let us consider the binary string
$x$ of length $K = 2^n$ that describes the "truth-table" of $L$ for inputs of length $n$. Let us compute
$C(x)$, which is of length $N = \mathrm{poly}(K) = 2^{cn}$ for some constant $c$. We define $L'$ to be the problem
whose truth-table, on inputs of length $cn$, is $C(x)$. Having a circuit that computes $L'$ correctly
on a $1/2 + \epsilon$ fraction of inputs is essentially the same as having oracle access to a string of length
$N$ that is within relative distance $1/2 - \epsilon$ from $C(x)$. Applying the local list-decoder, we can find a list of
programs such that one of them computes $x$ on any entry of our choice (that is, it solves $L$ on any
input of length $n$ of our choice) probabilistically in time polynomial in $1/\epsilon$ and in $n$. Since our goal
is to build a circuit, we can non-uniformly pick the correct program from the list, and convert the
probabilistic algorithm into a deterministic circuit. If we had a circuit of size $2^{\delta n}$ that computed
$L'$ on a $1/2 + 1/2^{\delta n}$ fraction of inputs of length $cn$, and if we picked $\epsilon$ appropriately, we would end up
constructing a circuit of size $2^{O(\delta n)}$ that solves $L$ on all inputs of length $n$, which contradicts the
assumption of the Theorem if $\delta$ is small enough.

4.4 Proof of the Goldreich-Levin Result 

In this section we discuss the Goldreich-Levin list-decoding algorithm for the Hadamard code. It will be
convenient to think of the codewords of the Hadamard code as functions:

Definition 6 Let $a \in \{0,1\}^k$, and define $L_a : \{0,1\}^k \to \{0,1\}$ to be the function $L_a(x) = a \cdot x$.
Then $L_a(\cdot)$ is the Hadamard encoding of $a$.
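A small sketch may help fix the notation. Below, strings in $\{0,1\}^k$ are represented as $k$-bit integers (a convention chosen for this example, as are the function names), and the Hadamard encoding of $a$ is the truth table of $L_a$.

```python
def inner(a, x):
    """Inner product a . x over GF(2); a and x are k-bit integers."""
    return bin(a & x).count("1") % 2


def hadamard(a, k):
    """Truth table of L_a, that is, the Hadamard encoding of a (length 2^k)."""
    return [inner(a, x) for x in range(2 ** k)]


k = 4
a, b = 0b1011, 0b0110
La, Lb = hadamard(a, k), hadamard(b, k)

# Linearity: L_a(x XOR y) = L_a(x) XOR L_a(y).
assert all(La[x ^ y] == La[x] ^ La[y] for x in range(2 ** k) for y in range(2 ** k))
# Two distinct codewords differ in exactly half of the 2^k positions.
assert sum(u != v for u, v in zip(La, Lb)) == 2 ** (k - 1)
```

The second assertion is the reason why agreement $1/2 + \epsilon$ is the natural threshold for list-decoding this code: a random string already agrees with any codeword on about half of the positions.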

We may then state the main result from [GL89] as follows.

Theorem 17 ([GL89]) There is a (probabilistic) algorithm that, given oracle access to a function
$g : \{0,1\}^k \to \{0,1\}$ and a parameter $\epsilon > 0$, runs in time $O(\frac{1}{\epsilon^2} k \log k)$ and outputs a list of $O(\frac{1}{\epsilon^2})$
elements of $\{0,1\}^k$ such that: for every $a$ for which $L_a$ and $g$ agree on a $\geq \frac{1}{2} + \epsilon$ fraction of inputs,
the probability that $a$ is in the list is at least 3/4.

We recall from Section 3.3.1 that if we are given an oracle that agrees with a linear function $L_a$ on,
say, a 7/8 fraction of the inputs, then it is possible to compute $a$ in time $O(k \log k)$. Our goal will
be to be able to simulate an oracle that has good agreement with $L_a$ by "guessing" the value of $L_a$ at
a few points.

We first choose $t$ random points $x_1, \ldots, x_t \in \{0,1\}^k$ where $t = O(1/\epsilon^2)$. For the moment, let us
suppose that we have "magically" obtained the values $L_a(x_1), \ldots, L_a(x_t)$. Then define $g'(z)$ as the
majority value of:

$L_a(x_j) \oplus g(z \oplus x_j) \qquad j = 1, 2, \ldots, t$    (1)

Since for each $j$ we obtain $L_a(z)$ with probability at least $\frac{1}{2} + \epsilon$, by choosing $t = O(1/\epsilon^2)$ we can
ensure that

$\Pr_{z, x_1, \ldots, x_t} [g'(z) = L_a(z)] \geq \frac{31}{32}$ ,    (2)

from which it follows that

$\Pr_{x_1, \ldots, x_t} [ \Pr_z [g'(z) = L_a(z)] \geq 7/8 ] \geq \frac{3}{4}$ .    (3)
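
The step from (2) to (3) is an application of Markov's inequality. A short derivation follows, reading the constant in (2) as 31/32 and writing $p(x_1, \ldots, x_t)$ (a name introduced here only for illustration) for the error probability over $z$.

```latex
\[
  p(x_1,\ldots,x_t) := \Pr_z\left[ g'(z) \neq L_a(z) \right],
  \qquad
  \mathbb{E}_{x_1,\ldots,x_t}\left[ p \right] \le \frac{1}{32} \quad \text{by (2)}.
\]
\[
  \Pr_{x_1,\ldots,x_t}\left[ p > \tfrac{1}{8} \right]
  \le \frac{\mathbb{E}[p]}{1/8}
  \le \frac{1/32}{1/8} = \frac{1}{4},
\]
so with probability at least $3/4$ over $x_1,\ldots,x_t$ we have
$\Pr_z[g'(z) = L_a(z)] \ge 7/8$, which is (3).
```
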
Consider the following algorithm.

Algorithm GL-First-Attempt:
    pick $x_1, \ldots, x_t \in \{0,1\}^k$ where $t = O(1/\epsilon^2)$
    for all $b_1, \ldots, b_t \in \{0,1\}$
        define $g'_{b_1 \ldots b_t}(z)$ as majority of: $b_j \oplus g(z + x_j)$
        apply the algorithm of Section 3.3.1 to uniquely decode $g'_{b_1 \ldots b_t}$
        add result to list



The idea behind this program is that we do not in fact know the values $L_a(x_j)$, so we guess
all possibilities by considering all choices for the bits $b_j$. For each $a$ such that $L_a$ and $g$ agree on
at least a $\frac{1}{2} + \epsilon$ fraction of their domain, we will eventually choose $b_j = L_a(x_j)$ for all $j$ and then, with high
probability, recover $a$ via the algorithm of Section 3.3.1. The obvious problem with this algorithm is
that its running time is exponential in $t = O(1/\epsilon^2)$ and the resulting list may also be exponentially
larger than the $O(1/\epsilon^2)$ bound promised by Theorem 17.

To overcome these problems, consider the following similar algorithm. 

Algorithm GL:
    choose $x_1, \ldots, x_l \in \{0,1\}^k$ where $l = O(\log(1/\epsilon))$
    for all $b_1, \ldots, b_l \in \{0,1\}$
        define $g'_{b_1 \ldots b_l}(z)$ as majority over all nonempty $S \subseteq \{1, \ldots, l\}$ of: $(\bigoplus_{j \in S} b_j) \oplus g(z + \sum_{j \in S} x_j)$
        apply the algorithm of Section 3.3.1 to uniquely decode $g'_{b_1 \ldots b_l}$
        add result to list



Let us now see why this algorithm works. First we define, for any nonempty $S \subseteq \{1, \ldots, l\}$,
$x_S = \sum_{j \in S} x_j$. Then, since $x_1, \ldots, x_l \in \{0,1\}^k$ are random, it follows that for any $S \neq T$, $x_S$ and
$x_T$ are independent and uniformly distributed. Now consider any $a$ such that $L_a(x)$ and $g(x)$ agree
on a $\frac{1}{2} + \epsilon$ fraction of the values in their domain. Then for the choice of $\{b_j\}$ where $b_j = L_a(x_j)$ for all $j$, we
have that

$\bigoplus_{j \in S} b_j = L_a(x_S)$

and, with probability $\frac{1}{2} + \epsilon$,

$g(z \oplus x_S) = L_a(z \oplus x_S) = L_a(z) \oplus L_a(x_S)$

so combining the above results yields

$\left( \bigoplus_{j \in S} b_j \right) \oplus g\left( z \oplus \sum_{j \in S} x_j \right) = L_a(z)$

with probability $\frac{1}{2} + \epsilon$.
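
The pairwise independence of the points $x_S$ can be checked exhaustively for small parameters. The sketch below (with illustrative names and toy values $k = 2$, $l = 3$ chosen here) enumerates every choice of $x_1, \ldots, x_l$ and confirms that, for each pair $S \neq T$, the pair $(x_S, x_T)$ takes every value in $\{0,1\}^k \times \{0,1\}^k$ equally often.

```python
from collections import Counter
from itertools import combinations, product


def subset_xors(xs):
    """Map each nonempty subset S (a bitmask) to x_S, the XOR of x_j for j in S."""
    out = {}
    for S in range(1, 2 ** len(xs)):
        v = 0
        for j, xj in enumerate(xs):
            if (S >> j) & 1:
                v ^= xj
        out[S] = v
    return out


k, l = 2, 3
counts = {}
for xs in product(range(2 ** k), repeat=l):      # every choice of x_1, ..., x_l
    xor = subset_xors(xs)
    for S, T in combinations(xor, 2):            # every pair of distinct subsets
        counts.setdefault((S, T), Counter())[(xor[S], xor[T])] += 1

# For each pair S != T, all 4^k values of (x_S, x_T) occur, and equally often.
for table in counts.values():
    assert len(table) == 4 ** k
    assert len(set(table.values())) == 1
```

Note that the $x_S$ are far from mutually independent (for instance $x_{\{1,2\}} = x_{\{1\}} \oplus x_{\{2\}}$), which is why the analysis relies on a tail bound that needs only pairwise independence.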

Note the following simple lemma whose proof we omit:

Lemma 18 Let $R_1, \ldots, R_t$ be a set of pairwise independent 0-1 random variables, each of which
is 1 with probability at least $\frac{1}{2} + \epsilon$. Then $\Pr[\sum_i R_i \geq t/2] \geq 1 - O(\frac{1}{\epsilon^2 t})$.
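
Although the proof is omitted, one standard route (sketched here as an assumption about the intended argument, not as the authors' proof) is Chebyshev's inequality, which only needs pairwise independence.

```latex
Let $R = \sum_{i=1}^{t} R_i$ and $\mu = \mathbb{E}[R] \ge t\left(\tfrac12 + \epsilon\right)$.
Pairwise independence gives
\[
  \mathrm{Var}[R] = \sum_{i=1}^{t} \mathrm{Var}[R_i] \le \frac{t}{4}.
\]
If $R < t/2$ then $\mu - R > \epsilon t$, so by Chebyshev's inequality
\[
  \Pr\left[ R < \tfrac{t}{2} \right]
  \le \Pr\left[ |R - \mu| \ge \epsilon t \right]
  \le \frac{\mathrm{Var}[R]}{\epsilon^2 t^2}
  \le \frac{1}{4 \epsilon^2 t}.
\]
```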

Lemma 18 allows us to upper-bound the probability that the majority operation used to compute
$g'$ gives the wrong answer. Combining this with our earlier observation that the $\{x_S\}$ are pairwise
independent, we see that choosing $l = 2 \log(1/\epsilon) + O(1)$ suffices to ensure that $g'_{b_1 \ldots b_l}(z) = L_a(z)$ with
probability at least $1 - c$, for any constant $c > 0$ of our choice. Thus we can use the algorithm of Section 3.3.1 to
obtain $a$ with high probability. Choosing $l$ as above ensures that the list generated is of length at
most $2^l = O(1/\epsilon^2)$ and the running time is then $O(k \log k / \epsilon^2)$, due to the $O(1/\epsilon^2)$ iterations of the
algorithm of Section 3.3.1. This completes the proof of the Goldreich-Levin theorem.
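
The following is a minimal, unoptimized sketch of Algorithm GL for small $k$. Several details are assumptions made for this example: strings are $k$-bit integers, the unique-decoding step of Section 3.3.1 is taken to be the usual self-correction rule $a_i = g'(r) \oplus g'(r \oplus e_i)$ with a majority vote over a few random $r$, and the parameter `samples` and all function names are hypothetical.

```python
import random


def dot(a, x):
    """a . x over GF(2), for k-bit integers a and x."""
    return bin(a & x).count("1") % 2


def goldreich_levin(g, k, l, samples=15, rng=random):
    """Return a list of 2^l candidate strings a (as k-bit integers).

    g is the oracle, a function from k-bit integers to {0, 1}.
    `samples` sets the size of the majority vote in the unique-decoding
    step; it is a parameter of this sketch, not of the algorithm in the text.
    """
    xs = [rng.randrange(2 ** k) for _ in range(l)]
    # x_S for every subset S, with S encoded as a bitmask over {0, ..., l-1}.
    x_sub = [0] * (2 ** l)
    for S in range(1, 2 ** l):
        low = S & -S                                  # lowest set bit of S
        x_sub[S] = x_sub[S ^ low] ^ xs[low.bit_length() - 1]

    candidates = []
    for b in range(2 ** l):                           # b encodes the guesses b_1..b_l
        def g_prime(z):
            # Majority over nonempty S of (XOR_{j in S} b_j) XOR g(z XOR x_S).
            ones = sum(dot(b, S) ^ g(z ^ x_sub[S]) for S in range(1, 2 ** l))
            return int(2 * ones > 2 ** l - 1)

        a = 0
        for i in range(k):
            # Self-correction: for a linear L_a, a_i = L_a(r) XOR L_a(r XOR e_i).
            votes = sum(g_prime(r) ^ g_prime(r ^ (1 << i))
                        for r in (rng.randrange(2 ** k) for _ in range(samples)))
            if 2 * votes > samples:
                a |= 1 << i
        candidates.append(a)
    return candidates


# Demo: an oracle that agrees with L_secret on roughly an 0.8 fraction of inputs.
k, secret = 8, 0b10110010
rng = random.Random(0)
table = [dot(secret, x) ^ (rng.random() < 0.2) for x in range(2 ** k)]
candidates = goldreich_levin(lambda z: table[z], k, l=5, rng=rng)
```

With a noiseless oracle the correct guess of $b_1, \ldots, b_l$ reproduces $L_a$ exactly, so `secret` always appears in the list; with noise it appears with good probability, and each candidate can be checked by estimating its agreement with the oracle.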

4.5 Applications to "Hard-Core Predicates" 

4.5.1 Hard-core Predicates for One-way Permutations 

Intuitively, a function $f : \{0,1\}^k \to \{0,1\}^k$ is a one-way permutation if it is easy to compute but
hard on average to invert. Note that $f$ is a permutation on the set $\{0,1\}^k$, and not a function that
permutes the bits of its input string. Formally,

Definition 7 A permutation $f : \{0,1\}^k \to \{0,1\}^k$ is $(s, \epsilon)$-one-way if

• there is an efficient algorithm that on input $x$ outputs $f(x)$

• for all circuits $D : \{0,1\}^k \to \{0,1\}^k$ of size $s$, and for all sufficiently large $k$,

$\Pr_x[D(f(x)) = x] \leq \epsilon$

Often in applications such as the construction of pseudorandom generators in cryptography, 
we are interested in a hard-on-average decision problem. This motivates the notion of a hard-core 
predicate: 

Definition 8 A function $B : \{0,1\}^k \to \{0,1\}$ is an $(s, \epsilon)$ hard-core predicate for a permutation $f$ if
it is hard on average to compute $B(x)$ given $f(x)$, that is: for all circuits $D : \{0,1\}^k \to \{0,1\}$ of
size $s$, and for all sufficiently large $k$,

$\Pr_x[D(f(x)) = B(x)] \leq 1/2 + \epsilon$

4.5.2 Hard-core Predicates Using Goldreich-Levin 

The Goldreich-Levin theorem gives us a hard-core predicate for any one-way permutation (with a 
slight modification) whose hardness is closely related to that of the original permutation: 

Theorem 19 If $f$ is $(s, 1/s)$-one-way, then $B(x, r) = x \cdot r$ is an $(s^{\Omega(1)}, 1/s^{\Omega(1)})$ hard-core predicate
for the permutation $f' : \{0,1\}^{2k} \to \{0,1\}^{2k}$, where $f'(x, r) = (f(x), r)$.

Proof: The proof is by contradiction: given an algorithm $A$ that computes the hard-core predicate
$B$ on a $1/2 + \epsilon$ fraction of its inputs, we produce an algorithm that inverts $f$ on some $\Omega(\epsilon)$ fraction
of inputs, contradicting the one-way-ness of $f$ for a suitable choice of parameters. The reduction
is uniform, so we will present the proof for a uniform algorithm $A$; the result extends readily
to circuits. More precisely, having a uniform reduction means that if we start with a one-way
permutation that is hard to invert using polynomial-time algorithms (respectively polynomial-sized
circuits), we obtain a hard-core predicate that is hard to predict using polynomial-time algorithms
(respectively polynomial-sized circuits).

Now, suppose we are given an algorithm $A$ that runs in time $t$ (or a circuit of size $t$ for the
non-uniform setting) such that:

$\Pr_{x,r}[A(f(x), r) = x \cdot r] > 1/2 + \epsilon$

Then by an averaging argument, we have:

$\Pr_x[ \Pr_r[A(f(x), r) = x \cdot r] > 1/2 + \epsilon/2 ] > \epsilon/2$

Fix any $x$ such that

$\Pr_r[A(f(x), r) = x \cdot r] > 1/2 + \epsilon/2$

Note that as we vary $r$ over $\{0,1\}^k$, $x \cdot r$ yields the Hadamard encoding of $x$ and therefore $A(f(x), \cdot)$
yields a good approximation to this encoding. We can then recover a list of candidates for $x$ via
list-decoding. The precise reduction is as follows: On input $f(x)$,

1. Define $g(r) = A(f(x), r)$. For an $\epsilon/2$ fraction of the choices of $x$, $g(r)$ and $L_x(r)$ agree on
a $1/2 + \epsilon/2$ fraction of the choices of $r$.

2. Run the Goldreich-Levin algorithm using $g(\cdot)$ as an oracle, and with parameter $\epsilon/2$. This
takes time $O(\frac{1}{\epsilon^2} k \log k)$ and computes a list of size $O(\frac{1}{\epsilon^2})$.

3. For each element $x'$ of the list, output $x'$ if $f(x') = f(x)$.

For an $\epsilon/2$ fraction of the choices of $x$, this algorithm outputs $x$ on input $f(x)$ with probability
at least 3/4. This yields an algorithm $A'$ that runs in time $O(t \cdot \frac{1}{\epsilon^2} k \log k + \frac{1}{\epsilon^2} k^{O(1)})$ satisfying

$\Pr_x[A'(f(x)) = x] \geq 3\epsilon/8$

which means $f$ cannot be $(\mathrm{poly}(k, 1/\epsilon, t), \Omega(\epsilon))$-one-way. It follows that if $f$ is $(s, 1/s)$-one-way,
then $B$ is an $(s^{\Omega(1)}, 1/s^{\Omega(1)})$ hard-core predicate for the permutation $f'$. □
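
The averaging argument used in the proof above can be spelled out as follows; the symbols $p_x$ and $\gamma$ are introduced here only for this derivation.

```latex
Let $p_x = \Pr_r[A(f(x),r) = x \cdot r]$, so that $\mathbb{E}_x[p_x] > \tfrac12 + \epsilon$.
Let $\gamma = \Pr_x\left[ p_x > \tfrac12 + \tfrac{\epsilon}{2} \right]$. Since $p_x \le 1$ always,
\[
  \tfrac12 + \epsilon < \mathbb{E}_x[p_x]
  \le \gamma \cdot 1 + (1 - \gamma)\left( \tfrac12 + \tfrac{\epsilon}{2} \right)
  \le \gamma + \tfrac12 + \tfrac{\epsilon}{2},
\]
hence $\gamma > \epsilon/2$.
```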

Remark 1 Note that if $f$ is one-way, then $f'$ is also one-way.

4.5.3 Hard-core Predicates via (Efficiently) List-Decodable Codes 

Let $C : \{0,1\}^k \to \{0,1\}^n$ be a binary code, and $f : \{0,1\}^k \to \{0,1\}^k$ be a permutation. Consider
the predicate $B : \{0,1\}^k \times [n] \to \{0,1\}$ given by $B(x, j) = C(x)_j$ and the corresponding function
$f' : \{0,1\}^k \times [n] \to \{0,1\}^k \times [n]$ given by $f'(x, j) = (f(x), j)$. Extending the result in the previous
section (which corresponds to $C$ being the Hadamard code), it is easy to see that if $f$ is a one-way
permutation, and $C$ is an efficiently list-decodable code, then $B(x, j) = C(x)_j$ is a hard-core
predicate for $f'$. The reduction is as follows:

Suppose we are given an algorithm $A$ that runs in time $t$ such that:

$\Pr_{x,j}[A(f(x), j) = C(x)_j] > 1/2 + \epsilon$

Then,

$\Pr_x[ \Pr_j[A(f(x), j) = C(x)_j] > 1/2 + \epsilon/2 ] > \epsilon/2$

Now, given $f(x)$, use the list-decoding algorithm for $C$ to find a list of all codewords with agreement at least
$1/2 + \epsilon/2$ with the corrupted codeword whose $j$-th entry is $A(f(x), j)$, in time $\mathrm{poly}(k, 1/\epsilon)$.

The advantage here is that for certain choices of $\epsilon$, there are efficiently list-decodable codes whose
block length is much shorter than that of the Hadamard code, and therefore we can construct
hard-core predicates for one-way permutations $f'$ whose input length is less than double that of $f$.
For instance, we can concatenate Reed-Muller codes with Hadamard codes to obtain codes with
message length $k$ and block length $O(k^2/\epsilon^4)$ which are efficiently list-decodable from agreement
$1/2 + \epsilon$, and for which we can compute $C(x)_j$ in time $\mathrm{poly}(k, \log 1/\epsilon)$ (to ensure that the predicate
can be computed efficiently). For $\epsilon = k^{-\log k}$, this yields a construction of hard-core predicates
wherein the increase in the input length for $f'$ is only $O(\log^2 k)$.

More generally, starting with an $(s, 1/s)$-one-way permutation $f : \{0,1\}^k \to \{0,1\}^k$, we can
construct another $(s, 1/s)$-one-way permutation $f' : \{0,1\}^{k + O(\log s)} \to \{0,1\}^{k + O(\log s)}$ that has an
$(s^{\Omega(1)}, 1/s^{\Omega(1)})$ hard-core predicate.

4.5.4 Pseudorandom Generators from One-way Permutations 

It is not difficult to show that if $x \cdot r$ is hard to compute given $f(x), r$, then $(f(x), r, x \cdot r)$ is
computationally indistinguishable from a random $(2k+1)$-bit string. In particular, if $f$ is a one-way
permutation, then $G : \{0,1\}^{2k} \to \{0,1\}^{2k+1}$ that sends $(x, r)$ to $(f(x), r, x \cdot r)$ is a pseudorandom
generator.

Once we obtain a pseudorandom generator that stretches the input by one bit, it is possible
to construct pseudorandom generators of arbitrary stretch [BM84, Yao82].
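
To make the shape of $G$ concrete, here is a toy sketch. The permutation `toy_permutation` is a stand-in chosen for this example only and is certainly not one-way; the bit layout of the output and the function names are likewise assumptions.

```python
def dot(x, r):
    """x . r over GF(2), for k-bit integers x and r."""
    return bin(x & r).count("1") % 2


def toy_permutation(x, k):
    """A stand-in permutation of {0,1}^k: x -> 5x + 3 mod 2^k. NOT one-way."""
    return (5 * x + 3) % (2 ** k)


def G(x, r, k, f=toy_permutation):
    """Map the 2k-bit seed (x, r) to the (2k+1)-bit string (f(x), r, x . r)."""
    return (f(x, k) << (k + 1)) | (r << 1) | dot(x, r)


k = 4
outputs = {G(x, r, k) for x in range(2 ** k) for r in range(2 ** k)}
assert len(outputs) == 2 ** (2 * k)       # G is injective because f is a permutation
assert max(outputs) < 2 ** (2 * k + 1)    # outputs are (2k+1)-bit strings
```

The image of $G$ covers only half of the $(2k+1)$-bit strings, so the output is never statistically close to uniform; pseudorandomness rests entirely on the computational hardness of predicting the last bit.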

4.6 Goldreich-Levin and Fourier Analysis
4.6.1 Fourier Analysis of Boolean Functions 

In the context of list-decoding of Hadamard codes, we have been looking at boolean functions
$f : \{0,1\}^k \to \{0,1\}$, and linear functions $L_a : \{0,1\}^k \to \{0,1\}$, where $L_a(x) = \bigoplus_{i : a_i = 1} x_i$. Instead,
we may regard $f$ as a function $f : \{0,1\}^k \to \{1,-1\} \subset \mathbb{R}$ by identifying $0, 1$ with $1, -1$ in the
range, and $\oplus$ with $\cdot$ (multiplication over $\mathbb{R}$). In addition, consider $\chi_a : \{0,1\}^k \to \{1,-1\}$ where

$\chi_a(x) = \prod_{i : a_i = 1} (-1)^{x_i} = (-1)^{\sum_i a_i x_i}$

For any $f, g : \{0,1\}^k \to \mathbb{R}$, we define the dot product $f \cdot g$ to be given by $\frac{1}{2^k} \sum_x f(x) g(x)$.
It is then straightforward to verify that $\chi_a \cdot \chi_a = 1$, and $\chi_a \cdot \chi_b = 0$ for $a \neq b$. Therefore,
$\{\chi_a \mid a \in \{0,1\}^k\}$ is an orthonormal basis for the set of functions $f : \{0,1\}^k \to \mathbb{R}$. This means we
can write any function $f : \{0,1\}^k \to \mathbb{R}$ as:

$f(x) = \sum_a \hat{f}_a \chi_a(x)$

where $\hat{f}_a = f \cdot \chi_a \in \mathbb{R}$.

Footnote (to Section 4.5.4): This use of hard-core predicates and one-way permutations to construct pseudorandom generators is due to Blum,
Micali and Yao [BM84, Yao82].

Footnote (to Section 4.5.4): Of course we have not formally defined computational indistinguishability nor pseudorandom generator. The
purpose of that short section is just to give the reader a feel for the use of hard-core predicates in cryptography. The
reader is referred to the excellent and comprehensive treatment in [Gol01].
The dot product $f \cdot g$ and the Fourier coefficients $\hat{f}_a$ have a combinatorial interpretation when
$f, g : \{0,1\}^k \to \{1,-1\}$ are boolean functions. Observe that

$f \cdot g = \frac{1}{2^k} \sum_x f(x) g(x) = \Pr_x[f(x) = g(x)] - \Pr_x[f(x) \neq g(x)] = 2 \Pr_x[f(x) = g(x)] - 1$

Therefore,

$\Pr_x[f(x) = g(x)] = \frac{1}{2} + \frac{1}{2} f \cdot g$

and in particular,

$\Pr_x[f(x) = L_a(x)] = \frac{1}{2} + \frac{1}{2} \hat{f}_a$    (4)

Also, for any $f : \{0,1\}^k \to \mathbb{R}$, we have $f \cdot f = \sum_a \hat{f}_a^2$. In the case that $f$ is a boolean function,
$f \cdot f = 1$, so this yields:

$\sum_a \hat{f}_a^2 = 1$    (5)
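
Equations (4) and (5) are easy to confirm numerically on a small example. The sketch below computes all Fourier coefficients by brute force for the majority function on three bits (an example function chosen here, written in the $\{1,-1\}$ notation); the helper names are illustrative.

```python
def chi(a, x):
    """Character chi_a(x) = (-1)^(a . x), for k-bit integers a and x."""
    return -1 if bin(a & x).count("1") % 2 else 1


def fourier_coefficients(f, k):
    """Brute-force coefficients f_a = 2^-k * sum_x f(x) chi_a(x)."""
    n = 2 ** k
    return [sum(f(x) * chi(a, x) for x in range(n)) / n for a in range(n)]


def maj(x):
    """Majority of 3 bits, with the bit 1 written as -1 and the bit 0 as +1."""
    return -1 if bin(x).count("1") >= 2 else 1


k = 3
coeffs = fourier_coefficients(maj, k)

# Equation (5): the squared coefficients of a boolean function sum to 1.
assert abs(sum(c * c for c in coeffs) - 1) < 1e-9
# Equation (4): Pr[f(x) = L_a(x)] = 1/2 + f_a / 2 for every a.
for a in range(2 ** k):
    agree = sum(maj(x) == chi(a, x) for x in range(2 ** k)) / 2 ** k
    assert abs(agree - (0.5 + coeffs[a] / 2)) < 1e-9
```

Here `chi(a, x)` plays the role of $L_a(x)$ after the identification of $0, 1$ with $1, -1$.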

4.6.2 Learning the Fourier Coefficients of a Boolean Function

The Goldreich-Levin theorem may be interpreted as providing an efficient algorithm that, given
oracle access to some boolean function $f$ and some threshold $\epsilon$, finds all linear functions $L_a$ that
agree with $f$ on at least a $1/2 + \epsilon$ fraction of inputs. By (4), these linear functions correspond
exactly to Fourier coefficients $\hat{f}_a$ that are at least $2\epsilon$. Furthermore, given $a$ and oracle access to
$f$, estimating $\hat{f}_a$ is easy (by estimating $\Pr_x[f(x) = L_a(x)]$), so we may eliminate any extraneous
coefficients that may be computed by the Goldreich-Levin algorithm.

This idea is often applied to learning classes of boolean functions $f$ for which the Fourier
coefficients are concentrated on a small set $S$, that is, there is some set $S \subseteq \{0,1\}^k$ such that
$|S| = \mathrm{poly}(k)$ and $\sum_{a \in S} \hat{f}_a^2 \geq 1 - \epsilon$. Given oracle access to $f$, we can define a function $g$
such that $f$ and $g$ disagree on an $O(\epsilon)$ fraction of inputs. Furthermore, $g$ can be efficiently computed
as follows:

1. Fix $t = \mathrm{poly}(k, 1/\epsilon)$.

2. Find a list $L$ of all $a$ such that the corresponding Fourier coefficients $\hat{f}_a$ are at least $\epsilon/t$ using
the Goldreich-Levin algorithm. From (5), there are at most $O(t^2/\epsilon^2)$ such values.

3. Compute an estimate $f'_a$ of $\hat{f}_a$ for all $a \in L$ by estimating $\Pr_x[f(x) = L_a(x)]$ using sampling.

4. Return $g = \sum_{a \in L} f'_a \chi_a$ as an estimate for $f$.
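
The steps above can be sketched as follows for small $k$. This is an illustration under stated assumptions, not the algorithm of [KM93]: the list of candidates in step 2 is found by exhaustive search over all $2^k$ strings in place of the Goldreich-Levin algorithm, coefficients are compared in absolute value so that large negative coefficients are kept as well, the real-valued $g$ is rounded to a boolean hypothesis by taking its sign, and the names `learn`, `threshold` and `samples` are hypothetical.

```python
import random


def chi(a, x):
    """Character chi_a(x) = (-1)^(a . x), for k-bit integers a and x."""
    return -1 if bin(a & x).count("1") % 2 else 1


def learn(f, k, threshold, samples, rng):
    """Return (h, estimates), where h approximates f : {0,1}^k -> {1,-1}."""
    points = [rng.randrange(2 ** k) for _ in range(samples)]
    values = [f(x) for x in points]
    estimates = {}
    for a in range(2 ** k):          # exhaustive search, standing in for step 2
        # Step 3: estimate Pr[f(x) = L_a(x)] by sampling, then apply equation (4).
        agree = sum(v == chi(a, x) for x, v in zip(points, values)) / samples
        coeff = 2 * agree - 1
        if abs(coeff) >= threshold:  # keep only the large coefficients
            estimates[a] = coeff

    def h(x):
        # Step 4: g = sum of estimated coefficients times characters, rounded to +-1.
        g = sum(c * chi(a, x) for a, c in estimates.items())
        return 1 if g >= 0 else -1

    return h, estimates


def f(x):
    """Majority of the three low-order bits of a 6-bit input, in +-1 notation."""
    return -1 if bin(x & 0b000111).count("1") >= 2 else 1


k = 6
h, estimates = learn(f, k, threshold=0.25, samples=2000, rng=random.Random(0))
errors = sum(h(x) != f(x) for x in range(2 ** k))
```

For this example the Fourier weight of `f` sits on four coefficients of magnitude 1/2, so a threshold of 1/4 separates them from the zero coefficients with a wide margin for sampling error.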

To bound the fraction of inputs on which / and g disagree, we will need to bound the errors due to 
the omission of the small Fourier coefficients, and in the estimation of the large Fourier coefficients. 

See [KM93] for this interpretation of Goldreich-Levin, and for interesting applications to learning
theory.

4.7 More Hard-Core Predicates Using List Decoding

In this section we present the results of Akavia et al. [AGS03], which give a fresh coding-theoretic
perspective to results that, previously, had only ad-hoc algebraic proofs.

The techniques of Akavia et al. give, in particular, new proofs that certain predicates are
hard-core for the RSA function and for exponentiation, which are the two most studied families of
permutations that are conjectured to be one-way. We recall their definitions below.

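Definition 9 Given $N = pq$, with $p$ and $q$ prime, choose $e$ so that $\gcd(e, \varphi(N)) = 1$. Then the RSA
permutation $\mathbb{Z}_N \to \mathbb{Z}_N$ is

$\mathrm{RSA}_{N,e}(x) = x^e \bmod N$ .

Definition 10 Given $p$ prime and $g$ a generator for $\mathbb{Z}_p^*$, the EXP isomorphism $\mathbb{Z}_{p-1} \to \mathbb{Z}_p^*$ is

$\mathrm{EXP}_{p,g}(x) = g^x \bmod p$ .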
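
Both maps are one line each with modular exponentiation. The sketch below uses toy parameters (far too small to be one-way, and chosen here only for illustration) to check that they are indeed bijections.

```python
from math import gcd


def rsa(x, N, e):
    """RSA_{N,e}(x) = x^e mod N."""
    return pow(x, e, N)


def exp_perm(x, p, g):
    """EXP_{p,g}(x) = g^x mod p."""
    return pow(g, x, p)


# Toy parameters: N = 5 * 11 and e = 3, which is coprime to phi(N) = 40.
N, phi, e = 55, 40, 3
assert gcd(e, phi) == 1
assert sorted(rsa(x, N, e) for x in range(N)) == list(range(N))   # a permutation of Z_N

# 2 is a generator of Z_11^*, so EXP maps Z_10 = {0..9} onto {1..10}.
p, g = 11, 2
assert sorted(exp_perm(x, p, g) for x in range(p - 1)) == list(range(1, p))
```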

Suppose that we have a permutation $p$ mapping $\mathbb{Z}_N$ into $\mathbb{Z}_N$, for example RSA or exponentiation,
and a predicate $B : \mathbb{Z}_N \to \{0,1\}$, and that we would like to show that if $p$ is one-way then $B$ is
hard-core. To prove such an implication we need to show that an algorithm that computes $B(x)$ given
$p(x)$ (for a fraction of $x$'s noticeably larger than 1/2) can be transformed into an algorithm that
computes $x$ given $p(x)$ (for a noticeable fraction of $x$'s).

In order to express this reduction as a coding-theoretic problem, we assume we also have a code
$C : \mathbb{Z}_N \to \{0,1\}^n$ satisfying the following property.

Definition 11 (Accessible Codes) $C$ is accessible with respect to $p$, $B$ if there is a probabilistic
polynomial time algorithm $A$ such that

• For $x \in \mathbb{Z}_N$, $j \in \{1, \ldots, n\}$,

$A(p(x), j) = p(y)$ s.t. $B(y) = C(x)_j$ .

• Over the choice of random $x$ and $j$, $A(p(x), j)$ is distributed close to uniform in $\mathbb{Z}_N$.

Intuitively, this is saying that if we are given $p(x)$ and $j$, and we are interested in computing the
$j$-th bit of the codeword $C(x)$, then we can efficiently find a string $z$ such that $B(p^{(-1)}(z))$ is equal
to $C(x)_j$. Furthermore, if we are interested in finding $C(x)_j$ for a random $x$ and for a random $j$
then the string $z$ will be (close to) uniformly distributed. This means that if we have an algorithm that has
a probability $1/2 + \epsilon$ of computing $B(y)$ given $p(y)$ (the probability being over a random choice
of $y$), then we also have an algorithm that has probability essentially $1/2 + \epsilon$ of computing $C(x)_j$
given $p(x)$ and $j$ (the probability being over the random choice of $x$ and $j$). In particular, for at least
an $\epsilon/2$ fraction of the $x$'s we can compute $C(x)$ on at least a $1/2 + \epsilon/2$ fraction of the entries.

We then have the following result. 

Lemma 20 Assume $p$ is a one-way permutation, $C$ is accessible (w.r.t. $p$, $B$), and $C$ is list-decodable
(from $\frac{1}{2} + \epsilon$ agreement in time $\mathrm{poly}(\frac{1}{\epsilon}, \log N)$). Then $B$ is a hard-core predicate for $p$.

Proof: We only give a sketch. Assume otherwise, that on a $\frac{1}{2} + \epsilon$ fraction of the $y$, we can determine
$B(y)$ from $p(y)$. Over $x$ and $j$, $A(p(x), j)$ is (nearly) uniform, so there is some $\frac{\epsilon}{2}$ fraction of the $x$'s
for which $\frac{1}{2} + \frac{\epsilon}{2}$ of the codeword indices $j$ are good. The list-decoding algorithm succeeds for these
$x$'s. □

In order to prove that, for example, a certain predicate is hard-core for RSA, we need to define
a list-decodable error-correcting code that is accessible with respect to RSA and the predicate. The
following family of codes will work in many instances.

Definition 12 (Multiplication Code) Let $B : \mathbb{Z}_N \to \{0,1\}$ be a predicate. We define the multiplication
code for $B$, $C^B : \mathbb{Z}_N \to \{0,1\}^N$, as

$C^B(x)_j = B(xj \bmod N)$ .

There are two steps now for showing that a predicate $B$ is hard-core for a permutation $p$ using
this framework: show that $C^B$ is accessible for $p$, and show that $C^B$ is an error-correcting code.

Clearly, the second part of the proof depends only on $B$, and it is independent of the particular
permutation $p$ that we have in mind. Perhaps surprisingly, for RSA and EXP the first part of the
proof does not depend on $B$ at all.

Lemma 21 For every $B$, $C^B$ is accessible with respect to RSA and EXP.
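
For RSA, one natural way to see this (offered here as an illustration, not as the proof from [AGS03]) is that RSA is multiplicative: $\mathrm{RSA}(x) \cdot \mathrm{RSA}(j) = \mathrm{RSA}(xj)$, so from $p(x)$ and $j$ one can compute $p(y)$ for $y = xj \bmod N$, and $B(y) = C^B(x)_j$ by definition. The sketch below checks this on toy parameters; the predicate `msb`, the restriction of $j$ to invertible elements, and the function names are assumptions of the example.

```python
from math import gcd


def msb(y, N):
    """An example predicate B: 0 on the lower half of Z_N, 1 on the upper half."""
    return int(2 * y >= N)


def mult_code_bit(B, x, j, N):
    """C^B(x)_j = B(x * j mod N)."""
    return B(x * j % N, N)


def access_rsa(px, j, N, e):
    """Given p(x) = x^e mod N and j, return p(y) for y = x * j mod N."""
    return px * pow(j, e, N) % N


N, e = 55, 3
units = [j for j in range(1, N) if gcd(j, N) == 1]
for x in range(N):
    for j in units:
        y = x * j % N
        assert access_rsa(pow(x, e, N), j, N, e) == pow(y, e, N)   # the output is p(y)
        assert msb(y, N) == mult_code_bit(msb, x, j, N)            # and B(y) = C^B(x)_j

# For a fixed invertible j, x -> x * j mod N is a bijection of Z_N,
# so y is uniform when x is uniform.
assert sorted(x * 7 % N for x in range(N)) == list(range(N))
```

Nothing in `access_rsa` refers to the predicate, which matches the remark that this part of the argument does not depend on $B$.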

Definition 13 $B : \mathbb{Z}_N \to \{0,1\}$ is a basic $t$-segment predicate if in

$B(0), B(1), B(2), \ldots, B(N-1), B(0)$ ,

there are $\leq t$ changes of values. $B$ is $t$-segment if for some invertible $a \in \mathbb{Z}_N$, $B(xa)$ is basic
$t$-segment.

For example, the most-significant bit, msb$(x)$, defined by

$\mathrm{msb}(x) = 0$ if $x < N/2$, and $\mathrm{msb}(x) = 1$ otherwise,

is a basic 2-segment predicate. (Moreover, msb was previously known to be a hard-core predicate
for RSA and EXP.)

As another example, if $N$ is odd, then the least significant bit, lsb$(x) = x \bmod 2$, is a 2-segment
predicate. Indeed, msb$(x)$ = lsb$(2x)$; equivalently, lsb$(x)$ = msb$(x/2)$ = msb$(x \cdot \frac{N+1}{2})$. For if
$x = 2y + b$ with $b \in \{0,1\}$ and $0 \leq y < N/2$ then

$x \cdot \frac{N+1}{2} = (N+1) y + \frac{N+1}{2} b \equiv \frac{N+1}{2} b + y \pmod{N}$ .

Notice that EXP leaks the least significant bit (whether or not $g^x \bmod p$ is a quadratic residue).
The above argument fails since lsb$(2x) = 0$ in the domain $\mathbb{Z}_{p-1}$: since $p - 1$ is even, 2 is not invertible.
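
The relation between lsb and msb is easy to check exhaustively for a small odd modulus. The sketch below assumes msb is 0 on the lower half of $\mathbb{Z}_N$ and 1 on the upper half; the helper `changes` and the choice $N = 21$ are illustrative.

```python
def msb(x, N):
    """0 if x < N/2, and 1 otherwise."""
    return int(2 * x >= N)


def lsb(x):
    """Least significant bit of x."""
    return x % 2


def changes(B, N):
    """Number of value changes in the cyclic sequence B(0), ..., B(N-1), B(0)."""
    return sum(B(x) != B((x + 1) % N) for x in range(N))


N = 21                              # any odd modulus
half = (N + 1) // 2                 # the inverse of 2 modulo N
assert 2 * half % N == 1
assert all(lsb(x) == msb(x * half % N, N) for x in range(N))
assert all(msb(x, N) == lsb(2 * x % N) for x in range(N))

assert changes(lambda x: msb(x, N), N) == 2            # msb is basic 2-segment
assert changes(lsb, N) == N - 1                        # lsb itself is far from basic 2-segment
assert changes(lambda x: lsb(2 * x % N), N) == 2       # but x -> lsb(2x) is, so lsb is 2-segment
```

For an even modulus such as $p - 1$ the first assertion already fails, since 2 has no inverse, which is the obstruction noted above for EXP.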

Theorem 22 (Main Result of [AGS03]) Let $B : \mathbb{Z}_N \to \{0,1\}$ be a balanced, $t$-segment predicate.
Then there is a list-decoding algorithm that, given $t$ and $\epsilon$, and oracle access to $g : \mathbb{Z}_N \to \{0,1\}$
having agreement $\frac{1}{2} + \epsilon$ with $C^B(x)$, runs in time $\mathrm{poly}(\log N, t, 1/\epsilon)$ and outputs a list that with high
probability contains $x$.

(By "balanced," we mean that $B$ has at most a constant more, or fewer, zeros than ones. This
condition could be weakened.)

We get as immediate corollaries that msb is hard-core for RSA and EXP, and, in fact, every 
balanced t-segment predicate B is hard-core for RSA and EXP. 

The proof of Theorem 1221 has essentially four parts. 

1. First, the authors consider the Fourier analysis of functions of the form $f : \mathbb{Z}_N \to \mathbb{C}$. We
can think of codewords $C^B(x)$ as functions mapping $\mathbb{Z}_N$ into {0,1}, and so, in particular,
as functions mapping $\mathbb{Z}_N$ into $\mathbb{C}$. The authors show that if $B : \mathbb{Z}_N \to \{0,1\}$ is a basic
t-segment predicate then, for every x, the function f corresponding to the codeword $C^B(x)$
is concentrated, which essentially means that f is well-approximated by a function that has
only few non-zero coefficients.

2. Then the authors show that if f is a concentrated function and g is another function that
agrees with f on a $1/2 + \epsilon$ fraction of inputs, then there is a Fourier coefficient that is large
(that is, at least some value depending on $\epsilon$ and on the "concentration" of f) for both f and g.

3. The authors also show that given oracle access to a function $f : \mathbb{Z}_N \to \mathbb{C}$ and a threshold $\tau$ it
is possible to efficiently find the (small) list of all the coefficients of f that are larger than $\tau$.

4. Finally, for every fixed t-segment predicate B, there is an algorithm that given a Fourier
coefficient finds the (small) list of all the strings x such that that coefficient is large for the
function corresponding to $C^B(x)$.

Having proved all these results, the list-decoding algorithm is as follows: we first find (using part
3 above) all the large Fourier coefficients of g, where "large" means larger than a threshold that is
polynomial in $\epsilon$, $1/t$ and $1/\log N$. Then, for each of these coefficients, we find (using part 4) all
the strings x such that $C^B(x)$ is large in that coefficient.
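To make the notions of "large coefficient" and "concentrated function" concrete, here is a minimal Python sketch that lists the large Fourier coefficients of a function on Z_N by brute force. It is not the efficient procedure of part 3 (it reads all N values of f and takes time quadratic in N); the normalization of the coefficients and the helper names are assumptions made for this illustration.

```python
import cmath


def fourier_coefficient(f, N, alpha):
    """Coefficient of f : Z_N -> C at alpha, normalized as (1/N) * sum_x f(x) * w^(-alpha * x)."""
    return sum(f(x) * cmath.exp(-2j * cmath.pi * alpha * x / N) for x in range(N)) / N


def heavy_coefficients(f, N, tau):
    """All alpha in Z_N whose coefficient has absolute value larger than tau (brute force)."""
    return [a for a in range(N) if abs(fourier_coefficient(f, N, a)) > tau]


def half_segment(N):
    """A simple segment-like predicate: 1 on the lower half of Z_N and 0 on the upper half."""
    return lambda x: 1 if 2 * x < N else 0
```

For example, `heavy_coefficients(half_segment(16), 16, 0.3)` returns only the three frequencies 0, 1 and 15: most of the weight of this predicate sits on a few coefficients, which is the flavor of concentration used in part 1.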

4.8 Notes and References 

Hard-core predicates appear in the work of Blum and Micali [BM84] and of Goldwasser and Micali
[GM84], and they were defined in a more general setting by Yao [Yao82], who showed that every
one-way permutation can be modified to have a hard-core predicate. Levin [Lev87] gives a different
proof that uses error-correcting codes. Goldreich and Levin [GL89] give a more efficient construction
of hard-core predicates. As previously discussed, the Goldreich-Levin algorithm can be seen as a
sub-linear time list-decoding procedure for the Hadamard code. Goldreich and others [GRS00]
give a list-decoding algorithm for the Reed-Muller codes that runs in sub-linear time for certain
ranges of the parameters. The algorithm is required to output a list of messages, rather than
a list of implicit representations, and so the running time cannot be sub-linear in k. (Although it
is sub-linear in n if n is much larger than k.) A general connection between list-decoding and
hard-core predicates was recognized in unpublished work in the mid 1990s by Impagliazzo and Sudan.
Kushilevitz and Mansour [KM93] recognized the connection between Goldreich-Levin and Fourier
analysis, and its applications to learning.

Prior to the work of Akavia and others [AGS03], hard-core predicates for specific algebraic
one-way permutations were proved with ad-hoc techniques. The work of Akavia and others [AGS03]
combines learning, list-decoding, Fourier analysis and hard-core predicates in a very surprising
generalization of the techniques of [GL89].

Sudan and others [STV01] present sub-linear time list-decoding algorithms for Reed-Muller
codes, with applications to worst-case to average-case complexity. The connection between coding
theory and worst-case to average-case reductions is further discussed in [TV02, Vio03, Tre03].



5 Locally Testable Codes 

In this section we consider codes with sub-linear time error-detection algorithms. We look for
algorithms that are able to distinguish valid codewords from strings that are "far" in Hamming
distance from all codewords.

Definition 14 (Locally Testable Code (LTC)) A code $C : \Sigma^k \to \Sigma^n$ is $(q, \delta, p)$-locally testable
if there is an oracle algorithm A of query complexity q such that

• For every message x, $\Pr[A^{C(x)} \text{ accepts}] = 1$.

• For every string y that has distance at least $\delta n$ from all codewords of C, $\Pr[A^{y} \text{ accepts}] < p$.

This notion was introduced by Rubinfeld and Sudan [RS96] and by Friedl and Sudan [FS95],
and it also appears (under the name of "probabilistically checkable" codes) in Arora's PhD thesis
[Aro94] and (under the name "checkable" codes) in Spielman's PhD thesis [Spi95].

Remark 2 We make a few remarks about the definition. 

• A stronger condition is to say that the code is $(c,q)$-locally checkable if there is an algorithm
A of query complexity q that satisfies the first part of the above definition, and such that if y is
a string that is at distance at least d from all codewords then $\Pr[A^{y} \text{ accepts}] < 1 - cd/n$. This
means that the algorithm has, as in the above definition, a constant probability of rejecting
strings that are at more than a certain constant minimum distance from all codewords, but
also that it has a non-zero probability of rejecting any non-codeword, and that the rejection
probability grows linearly with the distance from the code. Many positive results about locally
checkable codes prove that this stronger condition is satisfied.

• We gave a one-sided error definition. A two-sided error definition could also be given and it 
would also make sense. 



While the notion of locally testable codes was introduced only around 1994, local testers for
the Hadamard code [BLR93] and for the Reed-Muller code [GLR+91] had been studied in the
context of program self-testing, and had found their most powerful application in the construction
of probabilistically checkable proofs.
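As an illustration of what a local tester looks like, the following Python sketch implements the standard three-query linearity test associated with the Hadamard code. A string of length 2^k is represented by a function f on k-bit integers (an assumption of this sketch), and the number of repetitions is an arbitrary illustrative choice rather than a parameter from the source.

```python
import random


def hadamard_codeword(a):
    """Hadamard encoding of the message a, given as the function x -> <a, x> mod 2."""
    return lambda x: bin(a & x).count("1") % 2


def linearity_test(f, k, trials=20, rng=random):
    """Accept iff f(x) xor f(y) == f(x xor y) on `trials` random pairs (3 queries each)."""
    for _ in range(trials):
        x = rng.getrandbits(k)
        y = rng.getrandbits(k)
        if f(x) ^ f(y) != f(x ^ y):
            return False
    return True
```

Every codeword passes each trial, so valid codewords are accepted with probability 1, matching the one-sided error of Definition 14. A function such as the constant 1, which violates the linearity condition on every pair, is rejected at the first trial.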



5.1 Probabilistically Checkable Proofs 

Definition 15 (Probabilistically Checkable Proofs) Let L be a language and V be a probabilistic
polynomial time oracle machine. We say that V is an $(r(n), q(n))$-restricted PCP verifier for
L if the following conditions hold:

• On input x of length n, and for every oracle $\pi$, $V^{\pi}(x)$ makes at most $q(n)$ oracle queries and
tosses at most $r(n)$ random bits.

• If $x \in L$ then there is a string $\pi$ such that $\Pr[V^{\pi}(x) \text{ accepts}] = 1$.

• If $x \notin L$ then for every string $\pi$, $\Pr[V^{\pi}(x) \text{ accepts}] < 1/2$.

We denote by $\mathrm{PCP}[r(n), q(n)]$ the class of languages that have $(r(n), q(n))$-restricted verifiers.

We think of $\pi$ as being a "proof" that $x \in L$. Such a proof needs only be of length $2^{r+q}$,
because this is the maximum number of distinct oracle queries that the machine can make. If
$r + q = O(\log n)$ then the proof $\pi$ is of polynomial length, and the verification process of V can
be derandomized by running through all possible random strings, so every language in $\mathrm{PCP}[O(\log n), O(\log n)]$
is also in NP, and the proof $\pi$ can be thought of as an NP witness for x.
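The derandomization step can be sketched as follows: with r random bits there are only 2^r coin sequences, so the exact acceptance probability can be computed by enumeration. The verifier below is a toy constraint checker invented for this illustration (it is not a PCP verifier with a soundness gap); the instance format and all names are assumptions.

```python
from itertools import product


def toy_verifier(constraints, proof, coins):
    """Use the coins to pick one constraint (i, j, b) and check proof[i] xor proof[j] == b."""
    index = int("".join(map(str, coins)), 2)
    i, j, b = constraints[index]
    return (proof[i] ^ proof[j]) == b  # two oracle queries into the proof


def acceptance_probability(verifier, x, proof, r):
    """Exact acceptance probability, obtained by enumerating all 2^r coin sequences."""
    accepted = sum(1 for coins in product((0, 1), repeat=r) if verifier(x, proof, coins))
    return accepted / 2 ** r
```

When r = O(log n) the enumeration takes polynomial time, which is why a proof accepted with probability 1 can serve as an ordinary NP witness.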

The stunning result about PCP is that every NP witness can be put in such a form that it can 
be checked with high confidence in constant time. 

Theorem 23 (PCP Theorem [AS98, ALM+98]) $\mathrm{NP} = \mathrm{PCP}[O(\log n), O(1)]$.

Rather than trying to survey constructions, applications and ideas in the area of PCP, I will
discuss a recent development that will hopefully lead to a simpler proof of the PCP theorem and
possibly to locally testable codes and PCPs of optimal length.

5.2 PCPs of Proximity 

Consider the following definition. 

Definition 16 (PCP of Proximity [BSGH+04, DR04]) An $(r(n), q(n))$-restricted PCP of $\delta$-Proximity
for an NP relation R is a probabilistic polynomial time oracle algorithm A such that

• If $(x,y) \in R$, then there is a $\pi$ such that $\Pr[A^{y,\pi}(x) \text{ accepts}] = 1$.

• If $\Pr[A^{y,\pi}(x) \text{ accepts}] \ge 1/2$, then y is $\delta$-close to a $y'$ such that $(x,y') \in R$.

A PCP of proximity is the same as a standard PCP, except for the fact that the proof is composed
of two parts: a standard witness y and a possibly very complex component $\pi$. When the verifier
accepts with high probability, then not only does it have confidence that a witness exists for input x,
but it actually has confidence that y itself is close to a witness.

The definition becomes more clear, perhaps, when specialized to a particular NP problem, say, 
circuit satisfiability. 

Definition 17 (Assignment Tester) An assignment tester is a PCP of proximity for the circuit
satisfiability problem. Formally, a $(\delta, r(n), q(n))$ Assignment Tester is a probabilistic polynomial
time oracle algorithm A such that

• If C is a circuit and a is an assignment such that $C(a) = 1$, then there is a $\pi$ such that
$\Pr[A^{a,\pi}(C) \text{ accepts}] = 1$.

• If $\Pr[A^{a,\pi}(C) \text{ accepts}] \ge 1/2$, then a is $\delta$-close to an $a'$ such that $C(a') = 1$.

(Footnote to Section 5.1: surveying the area of PCP is a task covered by many survey papers, although
my favorite introduction to the area is the introduction of a research paper by Bellare et al. [BGS98].)

5.3 Relations between PCPs of Proximity, PCP, and Locally Testable Codes

We have already observed that a PCP of Proximity is simply a stronger kind of algorithm than a PCP verifier.
The randomness, query complexity and completeness constraints are the same. The soundness
constraint in the definition of PCPP not only implies that x is in the language, but it also implies
the stronger property that there is a witness that $x \in L$ which is close to the initial segment of the
oracle proof.

It is less trivial, but still simple, to get an LTC from an AT. Let $\mathcal{C}$ be an error-correcting code,
let C be a circuit that checks whether a given string is a codeword of $\mathcal{C}$, and let V be an
assignment tester for C. Then for every codeword $\mathcal{C}(x)$ there is a proof $\pi_x$ such that $(\mathcal{C}(x), \pi_x)$ is
accepted with probability 1 by the assignment tester V. Suppose now that $(y, w)$ is accepted with
probability higher than 1/2 by V: then y is close to a valid codeword $\mathcal{C}(x)$. If we could argue that
$(y, w)$ is close to $(\mathcal{C}(x), \pi_x)$, then we would have shown that the mapping $x \to (\mathcal{C}(x), \pi_x)$ is a good
error-correcting code. Unfortunately, if the length of the proof is large compared with the length
of the assignment, then it is not possible to conclude that $(y, w)$ is close to $(\mathcal{C}(x), \pi_x)$ just because
y is close to $\mathcal{C}(x)$. This is indeed a problem because in all known constructions the length of the
proof is super-linear in the length of the assignment.

This problem is resolved by considering the mapping $x \to (\bar{\mathcal{C}}(x), \pi_x)$ where $\bar{\mathcal{C}}(x)$ is a sequence
of several identical copies of $\mathcal{C}(x)$, enough copies so that the total length is equal to about $O(1/\delta)$
times the length of the proof $\pi_x$.

Our tester is given a string $(y_1, \ldots, y_k, w)$, where the $y_i$ are supposed to be identical copies of the
same codeword and w is supposed to be the proof $\pi_x$ that $\mathcal{C}(x)$ is a valid codeword. The tester
first checks that the strings $y_i$ are approximately equal. This is done by repeatedly picking a pair i, j
and then comparing the strings $y_i, y_j$ in a random position. Then the tester simulates the algorithm
of the assignment tester. Each query of the assignment tester to the assignment part of the oracle
is randomly routed to one of the strings $y_i$. Clearly a valid codeword is accepted with probability 1.
The analysis is completed by arguing that if the test accepts with high probability (say, larger than
3/4), then the strings $y_i$ are approximately all equal and that a "majority" decoding into a single
string y is such that $(y, w)$ would have been accepted by the assignment tester with probability at
least 1/2. So y is close to a valid codeword, and $y_1, \ldots, y_k$ is close to a valid repetition of a valid
codeword. Details are in the full version of [BSGH+04].
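The structure of the tester just described can be sketched in Python as follows. The assignment tester is treated as a black box: `assignment_tester` is a hypothetical callable that receives an oracle for the assignment and the proof string, and the number of consistency trials is an arbitrary illustrative choice, not a parameter from the source.

```python
import random


def copies_look_equal(copies, trials=30, rng=random):
    """Repeatedly pick a pair (i, j) and a position; reject on any disagreement."""
    k, n = len(copies), len(copies[0])
    for _ in range(trials):
        i, j = rng.randrange(k), rng.randrange(k)
        pos = rng.randrange(n)
        if copies[i][pos] != copies[j][pos]:
            return False
    return True


def repeated_codeword_test(copies, proof, assignment_tester, rng=random):
    """Consistency check on the copies, then the assignment tester with routed queries."""
    if not copies_look_equal(copies, rng=rng):
        return False

    def assignment_oracle(pos):
        # Each query to the assignment part is routed to a randomly chosen copy.
        return rng.choice(copies)[pos]

    return assignment_tester(assignment_oracle, proof)
```

When all copies are identical the consistency check never rejects, so the outcome is exactly that of the underlying assignment tester on a single copy.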

From [SS96, Spi96], we know that there are error-correcting codes $C : \{0,1\}^k \to \{0,1\}^n$ with
constant relative minimum distance, for which $n = O(k)$ and such that there is a circuit of size $O(k)$
that recognizes valid codewords. If there were assignment testers with constant query complexity
and linear proof length, then the above argument would show the existence of an LTC with block
length $O(k)$, that is, an asymptotically optimal LTC.

Assignment testers with linear proof length are not known, and the best known construction is 
as follows. 

Theorem 24 ([BSGH+04]) For every constant $\epsilon$ there is a constant q such that there is an
Assignment Tester of query complexity q that, for a circuit of size n, uses $O(\log n)$ randomness
and expects a proof of length $n \cdot 2^{O((\log n)^{\epsilon})}$.

We remind the reader that results from [KT00] imply that a locally decodable code with a
decoder having constant query complexity cannot have encoding length $k^{1+o(1)}$, so the above result
already shows a separation between the rate of LTCs versus LDCs with comparable parameters.

The main open question is clearly 
Open Question 4 Are there LTCs with constant query complexity and constant rate?

Similarly, we could ask if there are assignment testers with proofs of linear length. If there
were an assignment tester with logarithmic randomness, constant query complexity and a proof of
linear length, then there would be a randomized reduction from, say, 3SAT to the Max CUT problem.
Starting from a 3SAT instance with n variables and $O(n)$ clauses, the reduction would produce an
instance of Max CUT with $N = O(n)$ nodes and $M = O(n)$ edges. For some fixed constants $\rho$ and $\epsilon$, a
satisfiable 3SAT instance would produce a graph where the size of the maximum cut is at least $\rho M$;
an unsatisfiable instance of 3SAT would produce a graph where the size of the maximum cut is at
most $\rho(1 - \epsilon)M$. Consider now the following question.

Open Question 5 Is it possible to approximate the Max CUT problem in bounded-degree graphs
to within a factor $1 + o(1)$ in time $2^{o(n)}$?

A positive answer would imply that an assignment tester like the one discussed above could be
used to get a $2^{o(n)}$ algorithm for 3SAT, a conclusion that is typically considered to be unlikely. A
positive answer to Question 5 could then be taken as evidence that assignment testers need proofs
of super-linear length.

5.4 Notes and References 

The PCP Theorem was the culmination of a long line of collaborative work, that is difficult to 
summarize. 

In telling this story, one typically starts from the introduction of the model of "interactive proof
systems" due independently to Goldwasser, Micali and Rackoff [GMR89] and to Babai [Bab85]. In
this model, a probabilistic verifier interacts with a prover, as opposed to receiving a fixed proof
and checking its validity. The work of Goldwasser, Micali and Rackoff [GMR89] also introduces
the notion of "zero-knowledge proof system," which later became a fundamental primitive in the
construction of cryptographic protocols. A fundamental result by Goldreich, Micali and Wigderson
[GMW91] shows that every problem in NP has a zero-knowledge proof system, assuming that
a certain cryptographic assumption is true. Ben-Or et al. [BOGKW88] considered a model of
zero-knowledge where the verifier can interact with two (or, more generally, several) provers, who are all
computationally unbounded but unable to communicate with each other once the protocol starts.
The contribution of [BOGKW88] was to show that every problem in NP has a zero-knowledge
proof system in this model, without cryptographic assumptions. The model of multi-prover
interactive proofs (without the zero-knowledge requirement) was further studied by Fortnow, Rompel
and Sipser [FRS88]. They show that the class of languages admitting such proof systems (denoted MIP) has the
following equivalent characterization: it can be seen as the class of languages that admit
exponentially long proofs of membership that can be checked in polynomial time by a randomized verifier
(with bounded error probability). This class is clearly contained in NEXP, where NEXP is the
class of decision problems that admit exponentially long proofs that can be checked in exponential
time in the length of the input (but, without loss of generality, in polynomial time in the length of
the proof itself).

Initially, it was conjectured that MIP was only a small extension of NP, and that $\mathrm{coNP} \not\subseteq \mathrm{MIP}$.
Shortly after Shamir's proof that IP = PSPACE [Sha92], Babai, Fortnow and Lund [BFL91]
showed that MIP = NEXP. This is a truly impressive result: it says that for every language that
admits exponentially long proofs, such proofs can be encoded in such a way that a polynomial-time
randomized verifier can check them. The verifier will accept correct proofs with probability
1, and "proofs" of incorrect statements with probability < 1/2 (or, equivalently, with probability
exponentially small in the length of the input). So the verifier becomes convinced of the validity of
the proof even if it only looks at a negligible part of the proof itself.

It is natural to ask whether polynomially long proofs can be checked in polylogarithmic time.
This question has to be phrased carefully, since a polylogarithmic time verifier cannot even read the
instance, which makes it impossible to verify a proof for it. However if both the instance and the
proof are encoded in a proper (efficiently computable) way, then Babai et al. show that
polylogarithmic time verification is possible [BFLS91]. A variant of this result was also proved by Feige et
al. [FGL+91]: they show that NP-proofs have a quasi-polynomial length encoding (i.e. an encoding
of length $n^{O(\log\log n)}$) such that a polynomial-time verifier can verify the correctness of the proof in
polynomial time by using $O(\log n \log\log n)$ random bits and reading $O(\log n \log\log n)$ bits of the
proof. The main result of Feige et al. [FGL+91] was to show a connection between the
computational power of such a model and the hardness of approximating the Max Clique problem. The
result of Feige et al. [FGL+91] can be written as $\mathrm{NP} \subseteq \mathrm{PCP}[O(\log n \log\log n), O(\log n \log\log n)]$.

Arora and Safra [AS98] introduced several new ideas to improve on [FGL+91], and proved
that $\mathrm{NP} = \mathrm{PCP}[O(\log n), O(\sqrt{\log n})]$. The main contribution of Arora and Safra is the idea of
"composing" proof systems together. The next step was to realize that the reduction from PCP
to Max Clique was not an isolated connection between proof checking and approximability. Sudan
and Szegedy (as credited in [ALM+98]) discovered that the computations of an $(O(\log n), O(1))$-restricted
verifier can be encoded as instances of the Max 3SAT problem. Then, using the web
of reductions between optimization problems initiated by Papadimitriou and Yannakakis [PY91],
the strength of $(O(\log n), O(1))$-restricted verifiers also implies the hardness of
approximating several important problems, including the Traveling Salesman Problem and the Steiner
Minimal Tree problem in metric spaces. This was a strong motivation to prove the PCP Theorem,
which came only a few months after the initial circulation of the paper of Arora and Safra [AS98].

Further work has focused on strengthening the relation between query complexity, error probability
and other parameters, and on improving hardness of approximation results. In this section we have
concentrated on the problem of proof length and construction of locally testable codes, a question that has received
relatively less attention until recently.

Locally testable codes were discussed in many places, including [Aro94, Spi95, FS95].
Constructions of short PCPs were first presented by Polishchuk and Spielman [PS94]. Friedl and Sudan
[FS95] construct both short PCPs and good locally testable codes, and further improvements are
due, more recently, to Harsha and Sudan [HS00].

Goldreich and Sudan [GS02] give a nearly-linear length construction of locally testable codes
and PCPs. The result of [GS02] is based on a probabilistic construction, and so the codes are
not computable in polynomial time, although they can be computed by polynomial size circuits.
Similarly, the verifier in their PCP construction can be realized by a polynomial size circuit but not
by a uniform machine. The results of Goldreich and Sudan have been improved and made explicit
in [BSSVW03, BSGH+04]. There is no lower bound for locally testable codes, except the one in
[BSGS03] for a very special case.

Dinur and Reingold have recently made considerable progress towards a simpler and more
"combinatorial" proof of the PCP theorem [DR04], a direction of work on which there is essentially
no previous result, except for interesting work by Goldreich and Safra [GS00a].

(Footnote: A clique in a graph is a subset of vertices that are all pairwise adjacent. The Max Clique
problem is, given a graph, to find the largest clique.)
A Appendix



A.1 The Berlekamp-Welch Algorithm

Consider the following algorithm: 

1. If there is a polynomial p such that $p(x_i) = y_i$ for all $i = 1, \ldots, n$, output p. Otherwise:

2. Find polynomials $E(x)$ and $N(x)$ such that

(a) E is not identically zero;

(b) $E(x)$ has degree at most e and $N(x)$ has degree at most $e + k - 1$;

(c) For every $i = 1, \ldots, n$, $N(x_i) = E(x_i) \cdot y_i$.

3. Output $N(x)/E(x)$, or output error if $N(x)$ is not a multiple of $E(x)$.

We claim that the algorithm can be implemented to run in $O(n^3)$ time and that it correctly
finds the unique solution.

Let p be the unique solution, and let $I = \{i : p(x_i) \ne y_i\}$. If I is empty, then the algorithm finds
p in step (1). We want to show that, if I is not empty, then steps (2) and (3) can be implemented
in $O(n^3)$ time and that the algorithm finds p in step (3).

Regarding efficiency, polynomial division can be realized in almost linear time, so we only need
to worry about step (2). We can write $E(x) = \sum_{i=0}^{e} a_i x^i$ and $N(x) = \sum_{i=0}^{e+k-1} b_i x^i$, and see the
problem of realizing step (2) of the algorithm as the problem of finding coefficients $a_i$ and $b_i$ such
that the constraints $N(x_i) = E(x_i) y_i$ are satisfied. Such constraints are linear in $a_i$ and $b_i$, and
so, if the set of constraints has a non-zero solution, then a non-zero solution can be found in cubic
time using Gaussian elimination.

To see that a non-zero solution exists, let us define the polynomials $E(x) = \prod_{i \in I} (x - x_i)$ and
$N(x) = E(x) \cdot p(x)$. Then by definition the degree of E is at most e, and the degree of N is at most
$k - 1 + e$. Furthermore, if $i \in I$ we have $E(x_i) = 0$ and $N(x_i) = 0$, so that $N(x_i) = E(x_i) y_i = 0$;
if $i \notin I$ we have $N(x_i) = E(x_i) p(x_i) = E(x_i) y_i$, and so all the constraints are satisfied. Finally, I
is not empty (otherwise we would have found p at step (1) of the algorithm) and so E is not the
all-zero polynomial.

Regarding correctness, let E, N be the polynomials defined above, and let $E', N'$ be the solution
found by the algorithm in step (2). We want to argue that $N(x)/E(x) = N'(x)/E'(x)$, which is
the same as $N(x)E'(x) = N'(x)E(x)$. The polynomials $N(x)E'(x)$ and $N'(x)E(x)$ have degree at
most $2e + k - 1 < n$, and so, to show that they are equal, it is enough to show that they agree in
n inputs. This is easily verified because, for every $i = 1, \ldots, n$, we have

$$N(x_i)E'(x_i) = y_i E(x_i) E'(x_i) = N'(x_i) E(x_i)$$
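The steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not an optimized implementation: it works over a prime field GF(p), represents polynomials as coefficient lists with the low-degree coefficient first, skips step (1) (the linear system of step (2) also has a solution when there are no errors), and uses helper names invented for this sketch. It requires Python 3.8 or later for modular inverses via `pow`.

```python
def nonzero_kernel_vector(rows, ncols, p):
    """Return a non-zero v with rows * v = 0 (mod p), or None if only v = 0 works."""
    rows = [[c % p for c in row] for row in rows]
    pivots = []  # pivots[r] is the pivot column of reduced row r
    r = 0
    for c in range(ncols):
        pr = next((i for i in range(r, len(rows)) if rows[i][c]), None)
        if pr is None:
            continue  # no pivot here: column c is free
        rows[r], rows[pr] = rows[pr], rows[r]
        inv = pow(rows[r][c], -1, p)
        rows[r] = [(v * inv) % p for v in rows[r]]
        for i in range(len(rows)):
            if i != r and rows[i][c]:
                factor = rows[i][c]
                rows[i] = [(a - factor * b) % p for a, b in zip(rows[i], rows[r])]
        pivots.append(c)
        r += 1
        if r == len(rows):
            break
    free = [c for c in range(ncols) if c not in pivots]
    if not free:
        return None
    v = [0] * ncols
    v[free[0]] = 1  # set one free variable to 1 and solve for the pivot variables
    for row_index, c in enumerate(pivots):
        v[c] = (-rows[row_index][free[0]]) % p
    return v


def poly_divmod(num, den, p):
    """Quotient and remainder of num / den over GF(p); den must be non-zero."""
    num = [c % p for c in num]
    den = [c % p for c in den]
    while den[-1] == 0:
        den.pop()  # drop leading zero coefficients
    quot = [0] * max(len(num) - len(den) + 1, 1)
    inv = pow(den[-1], -1, p)
    for i in range(len(num) - len(den), -1, -1):
        q = (num[i + len(den) - 1] * inv) % p
        quot[i] = q
        for j, d in enumerate(den):
            num[i + j] = (num[i + j] - q * d) % p
    return quot, num  # num now holds the remainder


def berlekamp_welch(points, k, e, p):
    """Return the coefficients of the degree <= k-1 polynomial, or None on failure."""
    rows = []
    for x, y in points:
        e_part = [(y * pow(x, j, p)) % p for j in range(e + 1)]  # unknowns a_0..a_e
        n_part = [(-pow(x, j, p)) % p for j in range(e + k)]     # unknowns b_0..b_{e+k-1}
        rows.append(e_part + n_part)                             # E(x_i) y_i - N(x_i) = 0
    solution = nonzero_kernel_vector(rows, 2 * e + k + 1, p)
    if solution is None:
        return None
    E, N = solution[:e + 1], solution[e + 1:]
    if not any(E):
        return None
    quotient, remainder = poly_divmod(N, E, p)
    return None if any(remainder) else quotient[:k]
```

As a usage example, take p(x) = 3 + 2x over GF(13), evaluate it at x = 0, ..., 6, and corrupt two of the seven values; with k = 2 and e = 2 we have 2e + k - 1 = 5 < 7, so the sketch recovers the coefficient list [3, 2].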

A.2 List-Decoding of the Reed-Solomon Code
A.2.1 A Geometric Perspective

For the purpose of this algorithm, we will want to describe the n given points using low-degree
planar curves that pass through them; that is, we consider curves $\{(x,y) : Q(x,y) = 0\}$ where
$Q(x, y)$ is a low-degree polynomial. Note that we are not restricted to curves with degree one in y;
in particular, we may describe points on a circle centered at (0, 0) with the equation $x^2 + y^2 - 1 = 0$.

Other examples of point sets that may be described using low-degree curves are lines, and
unions of lines and circles.



A.2.2 A Simple List-Decoding Algorithm

For the list-decoding problem, the intuition is that if p is a polynomial with large agreement, then
the curve $y - p(x) = 0$ passes through many of the given points. Therefore, what we will do in
the list-decoding algorithm is to first find a low-degree polynomial Q passing through all of the
given points, and then show that all low-degree curves that pass through many of the given points
divide Q. This reduces the list-decoding problem to factorizing a bivariate polynomial over a
finite field, for which efficient algorithms do exist.

Algorithm List-Decode-RS 

Given: n distinct points $(x_1, y_1), \ldots, (x_n, y_n)$ in $\mathbb{F}_q^2$.

1. Find $Q(x,y)$ such that

• Q has low degree: $d_x - 1$ in x, $d_y - 1$ in y

• $Q(x_i, y_i) = 0$ for all $i = 1, 2, \ldots, n$

2. Factor $Q(x, y)$. For every factor of the form $y - p(x)$, if p is a feasible solution, output p.

There are two problems that we need to address: 

1. Does there exist a low-degree polynomial Q that passes through all the given points, and if so,
how can we find one efficiently?

2. Must every low-degree polynomial that passes through many of the given points divide Q?
For instance, taking t = 3 for concreteness, it seems conceivable that we have a polynomial
$R(x,y)$ quadratic in y that passes through 6 of the given points that lie on the curves $y - p_1(x) = 0$ and $y - p_2(x) = 0$
for two low-degree polynomials $p_1, p_2$, and that $R(x,y)$ divides Q, but neither $y - p_1(x)$ nor
$y - p_2(x)$ does.




In the example on the left, we have a set of 13 points that lie
on the union of a circle and two lines. Suppose the point in
the center is (0,0). Then, the set of points lies on the curve
described by: $(x^2 + y^2 - 1)(x - y)(x + y) = 0$.



A.2.3 Finding Q



First, we address the problem of finding Q. We may write

$$Q(x,y) = \sum_{i=0,\ldots,d_x-1;\ j=0,\ldots,d_y-1} c_{ij} x^i y^j$$

Now, the problem reduces to finding the $d_x d_y$ coefficients of Q: $c_{ij}$, $i = 0, 1, \ldots, d_x - 1$, $j = 0, 1, \ldots, d_y - 1$.
Observe that the requirement $Q(x_i, y_i) = 0$ is equivalent to a system of linear
constraints on the coefficients $\{c_{ij}\}$. Furthermore, this is a homogeneous system, so it will always
have the all 0's solution, corresponding to $Q = 0$. On the other hand, if $d_x d_y > n$, that is, the
number of variables is more than the number of linear constraints, then we can always efficiently
find a non-zero solution to the linear system that yields a non-zero Q that passes through all the
n points.
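The interpolation step can be sketched in Python as follows, again over a prime field GF(p) and with helper names that are assumptions of this sketch. The homogeneous system has one equation per point and one unknown $c_{ij}$ per monomial, and a non-zero solution is read off from a free column of the reduced matrix.

```python
def nonzero_kernel_vector(rows, ncols, p):
    """Return a non-zero v with rows * v = 0 (mod p), or None if only v = 0 works."""
    rows = [[c % p for c in row] for row in rows]
    pivots = []  # pivots[r] is the pivot column of reduced row r
    r = 0
    for c in range(ncols):
        pr = next((i for i in range(r, len(rows)) if rows[i][c]), None)
        if pr is None:
            continue  # no pivot here: column c is free
        rows[r], rows[pr] = rows[pr], rows[r]
        inv = pow(rows[r][c], -1, p)
        rows[r] = [(v * inv) % p for v in rows[r]]
        for i in range(len(rows)):
            if i != r and rows[i][c]:
                factor = rows[i][c]
                rows[i] = [(a - factor * b) % p for a, b in zip(rows[i], rows[r])]
        pivots.append(c)
        r += 1
        if r == len(rows):
            break
    free = [c for c in range(ncols) if c not in pivots]
    if not free:
        return None
    v = [0] * ncols
    v[free[0]] = 1
    for row_index, c in enumerate(pivots):
        v[c] = (-rows[row_index][free[0]]) % p
    return v


def find_Q(points, dx, dy, p):
    """Coefficients c[i][j] of a non-zero Q with deg_x < dx, deg_y < dy vanishing on all points."""
    assert dx * dy > len(points)  # more unknowns than constraints
    rows = [[(pow(x, i, p) * pow(y, j, p)) % p for i in range(dx) for j in range(dy)]
            for x, y in points]
    v = nonzero_kernel_vector(rows, dx * dy, p)
    return [[v[i * dy + j] for j in range(dy)] for i in range(dx)]


def eval_Q(c, x, y, p):
    """Evaluate Q(x, y) = sum_ij c[i][j] x^i y^j modulo p."""
    return sum(c[i][j] * pow(x, i, p) * pow(y, j, p)
               for i in range(len(c)) for j in range(len(c[0]))) % p
```

For instance, with eight points over GF(13) of which five lie on the line y = 2x + 1, and $d_x = d_y = 3$, any Q returned by `find_Q` vanishes on all eight points; moreover Q(x, 2x + 1) has degree at most 4 and five roots, so it vanishes identically, in line with the correctness argument for this algorithm.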

A.2.4 Proof of Correctness

Next, we will have to show that every polynomial p with large agreement with the points
$(x_1, y_1), \ldots, (x_n, y_n)$ is a factor of Q. More precisely, we are told that:

1. $p(x)$ is a degree $k - 1$ polynomial such that $y - p(x)$ is zero in at least t of the points.

2. $Q(x,y)$ has degree $d_x - 1$ in x and $d_y - 1$ in y and passes through all the points (that is,
$Q(x_i, y_i) = 0$ for $i = 1, 2, \ldots, n$).

3. There are at least t points $(x_i, y_i)$ such that $Q(x_i, y_i) = y_i - p(x_i) = 0$.

For simplicity, we can rewrite these conditions assuming that we are choosing n points on the
curve $Q(x,y) = 0$, which yields the following statement:

Proposition 25 Suppose that

1. $Q(x,y)$ is a bivariate polynomial in x, y with degree $d_x - 1$ in x and $d_y - 1$ in y.

2. $p(x)$ is a degree $k - 1$ polynomial in x.

3. There are at least t points $(x_i, y_i)$ such that $Q(x_i, y_i) = y_i - p(x_i) = 0$.

4. $t > (d_x - 1) + (k - 1)(d_y - 1)$.

Then $y - p(x)$ divides $Q(x, y)$.

This proposition is a special case of Bézout's Theorem, which says that any two curves that
share lots of points in common must share a common factor. Here, $y - p(x)$ is irreducible (over
polynomials in y with coefficients from $\mathbb{F}_q[x]$), so it divides $Q(x,y)$. A simple proof of this special
case is shown below.

It is also important to note that we only require that the points $(x_1, y_1), \ldots, (x_n, y_n)$ be distinct,
and not that $x_1, \ldots, x_n$ be distinct, as in the case for list-decoding Reed-Solomon codes. This allows
the list-decoding procedure to be used in a more general setting, as we shall see later.

Proof: View Q as a univariate polynomial in y whose coefficients are univariate polynomials in x:

$$q(y) = q_0(x) + y q_1(x) + \ldots + y^{d_y - 1} q_{d_y - 1}(x)$$

Recall the Factor Theorem for polynomials: $\beta$ is such that $q(\beta) = 0$ iff $y - \beta$ divides $q(y)$. This
tells us that $p(x)$ is such that $q(p(x)) = 0$ iff $y - p(x)$ divides $Q(x,y)$. Therefore, to show $y - p(x)$
divides $Q(x,y)$, it suffices to show that $Q(x,p(x))$ is the zero polynomial.

From condition 3, we know that $Q(x_i, p(x_i)) = 0$ for at least t distinct values of the $x_i$'s. On
the other hand, $Q(x,p(x))$ as a univariate polynomial in x can be written as:

$$Q(x,p(x)) = q_0(x) + p(x) q_1(x) + \ldots + p(x)^{d_y - 1} q_{d_y - 1}(x)$$

and has degree at most $(d_x - 1) + (k - 1)(d_y - 1)$. Therefore, if $t > (d_x - 1) + (k - 1)(d_y - 1)$, then
$Q(x,p(x)) = 0$ and $y - p(x)$ divides $Q(x,y)$. □

A.2.5 Fixing the Parameters

We are now ready to fix the parameters $d_x, d_y$. Recall that we require that:

1. $d_x d_y > n$, so that we have sufficient variables in the linear system for finding Q;

2. $t > d_x + k d_y$, to ensure that every polynomial with large agreement is a factor of Q.

We want the required agreement t to be as small as possible under both constraints. Since
$d_x + k d_y \ge 2\sqrt{k d_x d_y}$, with equality when $d_x = k d_y$, this is optimized (up to rounding) by setting $d_x = \sqrt{kn}$ and
$d_y = \sqrt{n/k}$, so $d_x + k d_y = 2\sqrt{kn}$. As a polynomial in y, Q has degree less than $d_y$ and therefore at most
$d_y$ factors of the form $y - p(x)$. Hence, there are at most $d_y = \sqrt{n/k}$ polynomials in the list. This yields the following
results:

Theorem 26 Given a list of n points $(x_1, y_1), \ldots, (x_n, y_n)$ in $\mathbb{F}_q^2$, we can efficiently find a list of
all polynomials $p(x)$ of degree at most $k - 1$ that pass through at least t of these n points, as long
as $t > 2\sqrt{nk}$. Furthermore, the list has size at most $\sqrt{n/k}$.

Theorem 27 For every $\epsilon > 0$, and for all sufficiently large n, there exist:

1. A $[n, \epsilon n, (1 - \epsilon)n]_n$ Reed-Solomon code, such that we can efficiently list-decode from agreement
in at least $2\sqrt{\epsilon}\, n$ locations, and the size of the list is at most $\sqrt{1/\epsilon}$.

2. A $[n, \epsilon^2 n/4, (1 - \epsilon^2/4)n]_n$ Reed-Solomon code such that we can efficiently list-decode from
agreement in at least $\epsilon n$ locations, and the size of the list is at most $2/\epsilon$.
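As a quick check, the parameters of Theorem 27 can be obtained by substituting the two choices of k into the bounds $2\sqrt{nk}$ and $\sqrt{n/k}$ of Theorem 26. The derivation below is a reader's sketch that ignores rounding k to an integer:

```latex
\begin{align*}
k = \epsilon n:&\quad 2\sqrt{nk} = 2\sqrt{\epsilon n^2} = 2\sqrt{\epsilon}\,n,
  &\sqrt{n/k} &= \sqrt{1/\epsilon};\\
k = \epsilon^2 n/4:&\quad 2\sqrt{nk} = 2\sqrt{\epsilon^2 n^2/4} = \epsilon n,
  &\sqrt{n/k} &= \sqrt{4/\epsilon^2} = 2/\epsilon.
\end{align*}
```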

A.2.6 Increasing the List-Decoding Radius

Observe that in the proof of correctness, we only require that $Q(x,p(x))$ has degree less than t in x.
Therefore, it suffices that for all monomials $x^i y^j$ in $Q(x, y)$, we have $i + kj < t$ (instead of the more
restrictive constraint that $i < t/2$ and $j < t/2k$). This means that we may consider any $Q(x,y)$ of
the form:

$$Q(x,y) = \sum_{i + kj < t} c_{ij} x^i y^j$$

Therefore, the number of coefficients (and thus the number of variables in the linear system) is
given by:

$$|\{(i, j) : i + kj < t\}| = t + (t - k) + (t - 2k) + \ldots + \left(t - \frac{t}{k} \cdot k\right) = \frac{t}{k} \cdot \frac{1}{2}(t + 0) = \frac{t^2}{2k}$$

(instead of $t/2 \cdot t/2k = \frac{t^2}{4k}$ if we consider only $i < t/2$ and $j < t/2k$.) To ensure that the linear
system $\{Q(x_i, y_i) = 0 \mid i = 1, 2, \ldots, n\}$ is under-determined, we need $\frac{t^2}{2k} > n$, or equivalently,
$t > \sqrt{2kn}$. For such t, it suffices to consider Q of the form:

$$Q(x,y) = \sum_{i + kj < t,\ j \le \sqrt{2n/k}} c_{ij} x^i y^j$$

This allows us to place an upper bound of $\sqrt{2n/k}$ on the size of the list (instead of the crude bound
$t/k$).

Theorem 28 Given a list of n points $(x_1, y_1), \ldots, (x_n, y_n)$ in $\mathbb{F}_q^2$, we can efficiently find a list of
all polynomials $p(x)$ of degree at most $k - 1$ that pass through at least t of these n points, as long
as $t > \sqrt{2nk}$. Furthermore, the list has size at most $\sqrt{2n/k}$.
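The counting argument behind the improved bound can be checked numerically. The short Python sketch below counts the monomials $x^i y^j$ with $i + kj < t$ exactly and finds the smallest t for which the linear system has more unknowns than constraints; the function names are illustrative assumptions.

```python
def count_monomials(t, k):
    """Number of pairs (i, j) of non-negative integers with i + k*j < t."""
    return sum(t - k * j for j in range((t + k - 1) // k))


def min_agreement(n, k):
    """Smallest t for which the number of coefficients exceeds the n linear constraints."""
    t = 1
    while count_monomials(t, k) <= n:
        t += 1
    return t
```

For n = 100 and k = 4, `min_agreement(100, 4)` gives 27, which is below $\sqrt{2nk} \approx 28.3$ and well below the threshold $2\sqrt{nk} = 40$ of Theorem 26.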